diff --git a/AGENTS.md b/AGENTS.md index 6e21f4476..fa8e345da 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -10,9 +10,10 @@ LLM-based agents can accelerate development only if they respect our house rules | Requirement | Rationale | |--------------|-----------| -| **British English** spelling (`organisation`, `licence`, *not* `organization`, `license`) except technical US spellings like `synchronized` | Keeps wording consistent with Chronicle's London HQ and existing docs. See the University of Oxford style guide for reference. | -| **ASCII-7 only** (code-points 0-127). Avoid smart quotes, non-breaking spaces and accented characters. | ASCII-7 survives every toolchain Chronicle uses, incl. low-latency binary wire formats that expect the 8th bit to be 0. | -| If a symbol is not available in ASCII-7, use a textual form such as `micro-second`, `>=`, `:alpha:`, `:yes:`. This is the preferred approach and Unicode must not be inserted. | Extended or '8-bit ASCII' variants are *not* portable and are therefore disallowed. | +| **British English** spelling (`organisation`, `licence`, *not* `organization`, `license`) except technical US spellings like `synchronized` | Keeps wording consistent with Chronicle's London HQ and existing docs. See the [University of Oxford style guide](https://www.ox.ac.uk/public-affairs/style-guide) for reference. | +| **ISO-8859-1** (code-points 0-255). Avoid smart quotes, non-breaking spaces and accented characters. | ISO-8859-1 survives every toolchain Chronicle uses. | +| If a symbol is not available in ISO-8859-1, use a textual form such as `>=`, `:alpha:`, `:yes:`. This is the preferred approach and Unicode must not be inserted. | Extended or '8-bit ASCII' variants are *not* portable and are therefore disallowed. | +| Tools to check ASCII compliance include `iconv -f ascii -t ascii` and IDE settings that flag non-ASCII characters. | These help catch stray Unicode characters before code review. 
| ## Javadoc guidelines @@ -26,12 +27,24 @@ noise and slows readers down. | Prefer `@param` for *constraints* and `@throws` for *conditions*, following Oracle's style guide. | Pad comments to reach a line-length target. | | Remove or rewrite autogenerated Javadoc for trivial getters/setters. | Leave stale comments that now contradict the code. | -The principle that Javadoc should only explain what is *not* manifest from the signature is well-established in the -wider Java community. +The principle that Javadoc should only explain what is *not* manifest from the +signature is well-established in the wider Java community. + +Inline comments should also avoid noise. The following example shows the +difference: + +```java +// BAD: adds no value +int count; // the count + +// GOOD: explains a subtlety +// count of messages pending flush +int count; +``` ## Build & test commands -Agents must verify that the project still compiles and all unit tests pass before opening a PR: +Agents must verify that the project still compiles and all unit tests pass before opening a PR. Running from a clean checkout avoids stale artifacts: ```bash # From repo root @@ -40,11 +53,18 @@ mvn -q verify ## Commit-message & PR etiquette -1. **Subject line <= 72 chars**, imperative mood: "Fix roll-cycle offset in `ExcerptAppender`". +1. **Subject line <= 72 chars**, imperative mood: Fix roll-cycle offset in `ExcerptAppender`. 2. Reference the JIRA/GitHub issue if it exists. 3. In *body*: *root cause -> fix -> measurable impact* (latency, allocation, etc.). Use ASCII bullet points. 4. **Run `mvn verify`** again after rebasing. +### When to open a PR + +* Open a pull request once your branch builds and tests pass with `mvn -q clean verify`. +* Link the PR to the relevant issue or decision record. +* Keep PRs focused: avoid bundling unrelated refactoring with new features. +* Re-run the build after addressing review comments to ensure nothing broke. 
+ ## What to ask the reviewers * *Is this AsciiDoc documentation precise enough for a clean-room re-implementation?* @@ -53,10 +73,18 @@ mvn -q verify * Does the commit point back to the relevant requirement or decision tag? * Would an example or small diagram help future maintainers? +### Security checklist (review **after every change**) + +**Run a security review on *every* PR**: Walk through the diff looking for input validation, authentication, authorisation, encoding/escaping, overflow, resource exhaustion and timing-attack issues. + +**Never commit secrets or credentials**: tokens, passwords, private keys, TLS materials, internal hostnames. Use environment variables, HashiCorp Vault, AWS/GCP Secret Manager, etc. + +**Document security trade-offs**: Chronicle prioritises low-latency systems; sometimes we relax safety checks for specific reasons. Future maintainers must find these hot-spots quickly. In Javadoc and `.adoc` files, call out *why*, e.g. "Unchecked cast for performance - assumes trusted input". + ## Project requirements -See the [Decision Log](src/main/adoc/decision-log.adoc) for the latest project decisions. -See the [Project Requirements](src/main/adoc/project-requirements.adoc) for details on project requirements. +See the [Decision Log](src/main/docs/decision-log.adoc) for the latest project decisions. +See the [Project Requirements](src/main/docs/project-requirements.adoc) for details on project requirements. ## Elevating the Workflow with Real-Time Documentation @@ -84,7 +112,7 @@ This tight loop informs the AI accurately and creates immediate clarity for all When using AI agents to assist with development, please adhere to the following guidelines: -* **Respect the Language & Character-set Policy**: Ensure all AI-generated content follows the British English and ASCII-7 guidelines outlined above. 
+* **Respect the Language & Character-set Policy**: Ensure all AI-generated content follows the British English and ISO-8859-1 guidelines outlined above. Focus on Clarity: AI-generated documentation should be clear and concise and add value beyond what is already present in the code or existing documentation. * **Avoid Redundancy**: Do not generate content that duplicates existing documentation or code comments unless it provides additional context or clarification. * **Review AI Outputs**: Always review AI-generated content for accuracy, relevance, and adherence to the project's documentation standards before committing it to the repository. @@ -122,8 +150,7 @@ Date:: YYYY-MM-DD Context:: * What is the issue that this decision addresses? * What are the driving forces, constraints, and requirements? -Decision Statement:: -* What is the change that is being proposed or was decided? +Decision Statement :: What is the change that is being proposed or was decided? Alternatives Considered:: * [Alternative 1 Name/Type]: ** *Description:* Brief description of the alternative. @@ -151,7 +178,7 @@ Notes/Links:: Do not rely on indentation for list items in AsciiDoc documents. Use the following pattern instead: ```asciidoc -section:: Top Level Section +section :: Top Level Section (Optional) * first level ** nested level ``` @@ -159,3 +186,23 @@ section:: Top Level Section ### Emphasis and Bold Text In AsciiDoc, an underscore `_` is _emphasis_; `*text*` is *bold*. + +### Section Numbering + +Use automatic section numbering for all `.adoc` files. + +* Add `:sectnums:` to the document header. +* Do not prefix section titles with manual numbers to avoid duplication. + +```asciidoc += Document Title +Chronicle Software +:toc: +:sectnums: +:lang: en-GB +:source-highlighter: rouge + +The document overview goes here. 
+ +== Section 1 Title +``` diff --git a/LICENSE.adoc b/LICENSE.adoc index eb12fcc48..f93a31eb3 100644 --- a/LICENSE.adoc +++ b/LICENSE.adoc @@ -1,4 +1,3 @@ - == Copyright 2016-2025 chronicle.software Licensed under the *Apache License, Version 2.0* (the "License"); diff --git a/README.adoc b/README.adoc index c3a8852f7..3b0069116 100644 --- a/README.adoc +++ b/README.adoc @@ -1,9 +1,11 @@ = Chronicle Threads Chronicle Software -:css-signature: demo :toc: macro -:toclevels: 2 :icons: font +:lang: en-GB +:toclevels: 2 +:css-signature: demo +:source-highlighter: rouge image:https://maven-badges.herokuapp.com/maven-central/net.openhft/chronicle-threads/badge.svg[caption="",link=https://maven-badges.herokuapp.com/maven-central/net.openhft/chronicle-threads] image:https://javadoc.io/badge2/net.openhft/chronicle-threads/javadoc.svg[link="https://www.javadoc.io/doc/net.openhft/chronicle-threads/latest/index.html"] @@ -79,6 +81,19 @@ public final class ExampleEventHandler implements EventHandler { call the `addHandler` method of the event loop, see also <> +== Links + +* link:src/main/docs/project-requirements.adoc[Project requirements (summary view)] +* link:src/main/docs/project-requirements.adoc[Project requirements (full specification)] +* link:src/main/docs/functional-requirements.adoc[Functional requirements catalogue] +* link:src/main/docs/architecture-overview.adoc[Architecture overview] +* link:src/main/docs/operational-controls.adoc[Operational controls] +* link:src/main/docs/thread-safety-guide.adoc[Thread-safety guide] +* link:src/main/docs/thread-security-review.adoc[Security review] +* link:src/main/docs/thread-performance-targets.adoc[Performance targets] +* link:src/main/docs/decision-log.adoc[Decision log] +* link:systemProperties.adoc[System properties reference] + [source,java] ---- el.addHandler(eh0); @@ -155,6 +170,51 @@ determines which of its child event loops the `EventHandler` is installed on. 
The second use of `HandlerPriority` is to enable each (child) event loop to determine how often each `EventHandler` is called e.g. `HandlerPriority.HIGH` handlers are executed more than `HandlerPriority.MEDIUM` handlers. +An `EventGroup` is configured with a set of supported handler priorities (by default all priorities). +Loops are only created for enabled priorities and attempts to add a handler whose priority has no corresponding loop result in an `IllegalStateException`. +If the configured priority set is empty, only the monitor loop is present and handlers with any other priority are rejected. + +== Requirements and Decisions + +The functional requirements for Chronicle Threads are captured in link:src/main/docs/project-requirements.adoc[project-requirements.adoc]. +Architectural and operational decisions, including event-loop threading and disk-space monitoring policies, are captured in link:src/main/docs/decision-log.adoc[decision-log.adoc]. + +== Advanced Usage Example + +The following example shows how two `EventGroup` instances can be used to separate a latency-critical trading pipeline from slower operational tasks in the same JVM. 
+ +[source,java,opts=novalidate] +---- +// Latency-sensitive trading group on isolated cores +EventGroup tradingGroup = EventGroup.builder() + .withName("trading-eg") + .withPauserMode(PauserMode.BUSY) + .withBinding("2-3") // bind to isolated cores for low jitter + .build(); + +tradingGroup.addHandler(new OrderBookHandler()); +tradingGroup.addHandler(new RiskCheckHandler()); + +// Operational / housekeeping group on shared cores +EventGroup opsGroup = EventGroup.builder() + .withName("ops-eg") + .withPauserMode(PauserMode.BALANCED) + .withBinding("4-5") // less aggressive CPU usage + .build(); + +opsGroup.addHandler(new MetricsExportHandler()); +opsGroup.addHandler(new DiskSpaceMonitorHandler()); + +tradingGroup.start(); +opsGroup.start(); + +// On shutdown +opsGroup.close(); +tradingGroup.close(); +---- + +The architecture overview (link:src/main/docs/architecture-overview.adoc[architecture-overview.adoc]) and operational controls (link:src/main/docs/operational-controls.adoc[operational-controls.adoc]) explain how to choose bindings, pausers and monitoring settings for such topologies. + == Pausers Chronicle Threads provides a number of implementations of the diff --git a/pom.xml b/pom.xml index a723efada..1aa2793ff 100644 --- a/pom.xml +++ b/pom.xml @@ -32,7 +32,7 @@ net.openhft third-party-bom - 3.27ea5 + 3.27ea7 pom import diff --git a/src/main/adoc/decision-log.adoc b/src/main/adoc/decision-log.adoc deleted file mode 100644 index 3a23806aa..000000000 --- a/src/main/adoc/decision-log.adoc +++ /dev/null @@ -1,151 +0,0 @@ -= Chronicle Threads Decision Log -:revnumber: 1.2 -:revdate: 2025-05-26 -:toc: -:source-highlighter: rouge -:lang: en-GB - -This log summarises notable architectural and technical decisions taken for the Chronicle Threads component. -Identifiers follow the Nine-Box taxonomy (e.g., FN-Functional, NF-Non-Functional, OPS-Operational) an outline of which is referenced in the project's `AGENTS.md` document. 
-Each decision record aims to provide context, the decision itself, its status, rationale, and consequences. - -== Decision Log Entries - -=== THR-FN-001: Event Loops are Single-Threaded for Handler Execution - -* Date: 2015-02-24 -* Context: -** A core requirement for Chronicle Threads is to simplify concurrent programming for event-driven systems and enable predictable low-latency performance for handlers. -** Traditional multi-threaded models for event handlers often require complex locking mechanisms, which can introduce contention, bugs, and performance bottlenecks in the "hot path". -* Decision Statement: -** Each `EventLoop` instance shall operate on a dedicated Java platform thread. -** All `EventHandler` instances registered with a particular `EventLoop` (excluding `BLOCKING` or `CONCURRENT` priority handlers which have their own threading model within an `EventGroup`) are executed serially by this single thread. -* **Alternatives Considered:** -** *Multi-threaded Event Loops:* -*** *Description:* Allow multiple handlers on the same loop to execute concurrently across a thread pool. -*** *Pros:* Potential for higher throughput on a single loop if handlers are independent and can truly run in parallel. -*** *Cons:* Requires handlers to be thread-safe, introduces complexity of locks or other concurrency primitives, makes latency less predictable due to potential contention, and increases the difficulty of reasoning about handler state. -** *Actor Model per Handler:* -*** *Description:* Each handler instance is an actor with its own queue and thread. -*** *Pros:* Strong isolation. -*** *Cons:* Higher resource consumption (threads, queues) and potentially higher message passing overhead compared to direct serialized invocation. 
-* **Rationale for Decision:** -** This model significantly simplifies `EventHandler` implementation, as developers do not need to manage thread-safety within a handler's interaction with its own state or other handlers on the same loop. -** It promotes predictable, low-jitter latency by eliminating lock contention in the critical execution path of handlers. -** Aligns with the common "thread-per-core" pattern used in low-latency systems. -* **Impact & Consequences:** -** Positive: -** Simplifies handler development and testing. -** Improves latency predictability for handlers on a given loop. -** Facilitates easier reasoning about application state managed by handlers. -** Negative: -** The throughput of a single `EventLoop` is limited by a single CPU core. -** A misbehaving (long-running or blocking) handler on a standard event loop can starve other handlers on the same loop. (This is mitigated by `EventGroup` allowing `BLOCKING` priority handlers to run in separate, dedicated thread pools). -* What are the trade-offs made? Single core throughput vs. simplicity and low-contention latency. -* **Notes/Links:** -*** This decision is fundamental to the design of `VanillaEventLoop` and `MediumEventLoop`. - -=== THR-NF-P-002: Default to Busy-Spin Pauser for Latency-Critical "Fast Threads" - -* Date: 2015-02-24 -* Context: -** For "Fast Threads" (latency-critical event loops, typically pinned to isolated cores), minimizing the time taken to react to new events is paramount. -** Traditional pausing mechanisms (yielding, sleeping) introduce variable and often significant delays when a thread needs to resume. -* Decision Statement: -** `EventLoop` instances configured for the highest performance (e.g., using `PauserMode.BUSY` or `PauserMode.TIMED_BUSY`) shall employ a busy-spinning (or near busy-spinning) `Pauser` strategy by default. 
-** Alternative pauser modes (`YIELDING`, `BALANCED`, `SLEEPY`, `MILLI`) remain available and are recommended for less latency-sensitive loops or systems with CPU core constraints. -* **Alternatives Considered:** -** *Yielding Pauser as Default:* -*** *Description:* `Thread.yield()` immediately when idle. -*** *Pros:* More CPU-friendly than busy-spinning. -*** *Cons:* Introduces higher and less predictable wakeup latency. -** *Sleeping Pauser as Default:* -*** *Description:* `LockSupport.parkNanos()` or `Thread.sleep()` immediately when idle. -*** *Pros:* Lowest CPU usage when idle. -*** *Cons:* Highest wakeup latency. -** *Adaptive Pauser (e.g., `BALANCED`) as Default for all:* -*** *Description:* Uses a mix of spinning, yielding, and sleeping. -*** *Pros:* Good balance for general use cases. -*** *Cons:* Not the absolute lowest latency for highly optimized fast threads compared to pure busy-spinning. -* **Rationale for Decision:** -** Busy-spinning keeps the thread's cache hot and avoids the overhead of context switching and OS scheduler-induced delays, leading to the lowest possible reaction time when an event becomes available. -* **Impact & Consequences:** -** Positive: -** Achieves minimal event processing latency (p99.99 and tail latencies) for designated fast threads. -** Predictable entry into event handler logic once an event is detected. -** Negative: -** Consumes 100% of a CPU core, even when idle. This necessitates careful system configuration, including CPU isolation (`isolcpus`) and ensuring enough cores are available for other system tasks and less critical threads. -** If not managed properly (e.g., too many busy-spinning threads for available isolated cores), it can degrade overall system performance. -* What are the trade-offs made? Lowest latency vs. high CPU consumption and need for careful system tuning. -* **Notes/Links:** -*** See `Pauser.java`, `BusyPauser.java`, and `PauserMode.java`. 
-*** Documented recommendation in `README.adoc` and `project-requirements.adoc` (R4-02). - -=== THR-OPS-003: Background Disk Space Monitoring - -* Date: 2018-05-11 -* Context: -** Applications using Chronicle Queue or other disk-persistent Chronicle components can fail or suffer data loss if the underlying disk storage runs out of space. -** Operations teams need early warnings to prevent such critical failures. -* Decision Statement: -** A background monitoring singleton (`DiskSpaceMonitor`) shall be provided within Chronicle Threads. -** This monitor will periodically check disk usage for paths associated with Chronicle components (e.g., queue directories when they are initialized). -** Warnings shall be logged via a `NotifyDiskLow` service (defaulting to `NotifyDiskLowLogWarn`) when free disk space drops below a configurable percentage threshold (system property: `chronicle.disk.monitor.threshold.percent`). -* **Alternatives Considered:** -** *No built-in monitoring:* -*** *Description:* Require users to rely solely on external, system-level disk monitoring tools. -*** *Pros:* Simplifies the library. -*** *Cons:* Less integrated; users might overlook setting up adequate external monitoring tailored to Chronicle's usage patterns. -** *More Aggressive Actions:* -*** *Description:* e.g., halt queue appenders if disk space is critically low. -*** *Pros:* Could prevent further writes that might lead to immediate crashes. -*** *Cons:* May be too intrusive for a general-purpose library; such policies are better implemented at the application level. The library's role is primarily to inform. -* **Rationale for Decision:** -** Provides a built-in, proactive layer of defense that is easy to enable. -** Logging warnings is a non-intrusive way to alert operations without unilaterally halting application functionality. -** The `ServiceLoader` mechanism for `NotifyDiskLow` allows for custom notification actions if needed. 
-* **Impact & Consequences:** -** Positive: -** Operations teams receive early warnings of potential storage exhaustion, allowing for proactive intervention. -** Reduces the risk of unexpected application failures due to full disks. -** Negative: -** The monitor consumes minimal system resources (one background thread, periodic I/O for disk checks). -** Effectiveness relies on logs being actively monitored or a custom `NotifyDiskLow` service being implemented for more active notifications. -* What are the trade-offs made? Built-in convenience and basic safety vs. reliance on external monitoring for comprehensive operational alerting. -* **Notes/Links:** -*** See `DiskSpaceMonitor.java`, `NotifyDiskLow.java`, `NotifyDiskLowLogWarn.java`. -*** Key system properties: `chronicle.disk.monitor.disable`, `chronicle.disk.monitor.threshold.percent`. - -=== THR-DOC-004: Nine-Box Taxonomy Tagging for Requirements and Decisions - -* Date: 2025-03-01 -* Context: -** A need for a consistent and structured way to identify, categorize, and trace requirements, decisions, and potentially other project artifacts (like tests or specific code modules). -** Previous or alternative projects might have used ad-hoc or purely sequential numbering, making it harder to understand the nature or scope of an item from its ID alone. -* Decision Statement: -** All functional requirements (in `project-requirements.adoc`) and architectural decisions (in this log) shall use identifiers prefixed with `THR-` followed by a tag from the Nine-Box taxonomy (e.g., `FN`, `NF-P`, `OPS`, `DOC`) and a sequential number (e.g., `THR-FN-001`). -** The specific Nine-Box tag definitions and usage guidelines are maintained in the project's `AGENTS.md` document. -* **Alternatives Considered:** -** *Simple Sequential Numbering (e.g., REQ-001, DEC-001):* -*** *Description:* Basic sequential IDs. -*** *Pros:* Very simple to implement. -*** *Cons:* Provides no information about the type or domain of the item from its ID. 
-** *Custom Categorization Scheme:* -*** *Description:* Develop a project-specific set of categories. -*** *Pros:* Could be perfectly tailored. -*** *Cons:* Requires effort to define and maintain; less transferable knowledge if team members work on other projects using different schemes. -* **Rationale for Decision:** -** The Nine-Box taxonomy (referenced from `AGENTS.md`) provides a pre-defined, reasonably comprehensive set of categories that are broadly applicable to software development artifacts. -** Using this existing scheme promotes consistency if it's adopted across multiple Chronicle Software projects. -** Tags in IDs offer immediate insight into the item's domain. -* **Impact & Consequences:** -** Positive: -** Improved traceability between requirements, decisions, tests, and potentially code. -** Easier for team members to understand the context of an identified item quickly. -** Facilitates better organization and searching of documentation. -** Negative: -** Requires team members to be familiar with the Nine-Box taxonomy and apply it consistently. -** Initial setup of the scheme and guidelines in `AGENTS.md`. -* What are the trade-offs made? Richer categorization and traceability vs. slight learning curve for the taxonomy. -* **Notes/Links:** -*** The Nine-Box taxonomy details are in `AGENTS.md`. diff --git a/src/main/adoc/project-requirements.adoc b/src/main/adoc/project-requirements.adoc deleted file mode 100644 index cd9f87894..000000000 --- a/src/main/adoc/project-requirements.adoc +++ /dev/null @@ -1,198 +0,0 @@ -= Chronicle Threads – Functional Requirements Specification -:revnumber: 1.0 -:revdate: 2025-05-25 -:toc: -:source-highlighter: rouge -:lang: en=-GB - -== 1 Purpose and Scope - -This document specifies the *functional requirements* for the -*Chronicle Threads* library (OpenHFT / Chronicle Software). 
-It is aimed at architects, developers, and performance engineers who -intend to use Chronicle Threads as the execution engine for -ultra-low-latency, event-driven Java systems. - -*Out of scope* are: -* Detailed Non-Functional Requirement (NFR) specifications beyond the key performance targets identified in Section 5. -* Licensing and commercial support details. -* The detailed product roadmap (though Section 8 outlines potential future enhancements). - -== 2 Definitions - -[cols="1,3"] -|=== -|*Term* |*Meaning* - -|*EventLoop* |A single-threaded loop that repeatedly invokes `EventHandler` instances. -|*EventHandler* |Application-provided component that implements `action()`; executed by an `EventLoop`. -|*EventGroup* |A container that manages one or more `EventLoop`s, potentially including a dedicated monitor loop. -|*Pauser* |Strategy object that controls how an idle loop waits (e.g., busy-spin, yield, sleep). -|*Fast Thread* |A thread—usually pinned to an isolated CPU core—running a latency-critical event loop. -|*Hot Path* / *Fast Path* |The code execution path that is performance-critical and frequently executed, where low latency and minimal overhead are paramount. -|*Loop-Block Monitor* |Background handler that measures `EventHandler` run-time and logs outliers exceeding a configured threshold. -|*NUMA* |Non-Uniform Memory Access; a memory architecture where memory access time depends on the memory location relative to a processor. -|*Chronicle Affinity* |A library used for managing thread affinity, allowing threads to be pinned to specific CPU cores. -|=== - -== 3 Overall Architectural Goals - -* Minimise **end-to-end latency** and **jitter** for event processing. -* Provide a **deterministic single-threaded** execution model for event handlers within an `EventLoop`, removing the need for locks in the hot path of handler logic. -* Allow **fine-grained control** of the latency versus CPU consumption trade-off. 
-* Integrate effectively with other Chronicle components (e.g., Chronicle Queue, Chronicle Map, Chronicle Wire, Chronicle Services, Chronicle Tune). - -== 4 High-Level Functional Requirements - -=== 4.1 Event‐Loop Lifecycle and Configuration - -. *Creation and Configuration* -*THR-FN-001* The library SHALL provide builder APIs (e.g., `EventLoopBuilder`, `EventGroupBuilder`) for fluent and immutable configuration of event loops and groups. -. *Start/Stop Operations* -*THR-FN-002* An `EventLoop` or `EventGroup` SHALL support idempotent `start()` and graceful `close()` operations. A graceful `close()` implies that active handlers are allowed to complete their current action, and associated resources are released. -. *Non-Restartable Loops* -*THR-FN-003* Attempting to `start()` an `EventLoop` or `EventGroup` that has already been `close()`d SHALL be rejected or have no effect, as loops are not designed to be restartable. - -=== 4.2 Handler Management - -. *Dynamic Registration* -*THR-FN-004* Clients SHALL be able to add an `EventHandler` to a running `EventGroup` at runtime. -. *Handler Priority* -*THR-FN-005* Each `EventHandler` SHALL declare a `HandlerPriority`. The system SHALL support a range of priorities influencing execution order and/or loop assignment. (Refer to `net.openhft.chronicle.core.threads.HandlerPriority` enum for the exhaustive list of priorities, e.g., `HIGH`, `MEDIUM`, `LOW`, `TIMER`, `BLOCKING`, `REPLICATION`, `CONCURRENT`, `MONITOR`, `DAEMON`). -. *Execution Contract* -*THR-FN-006* An `EventHandler`'s `action()` method MUST be invoked serially on the `EventLoop`'s dedicated thread. -*THR-FN-007* The `action()` method MUST return a boolean: `true` if useful work was done (suggesting the handler may have more immediate work), `false` otherwise. -. *Self-Deregistration* -*THR-FN-008* An `EventHandler` MAY request its own deregistration from the `EventLoop` by throwing `net.openhft.chronicle.core.threads.InvalidEventHandlerException`. -. 
*Error Isolation and Reporting* -*THR-NF-O-009* Unchecked exceptions thrown by an `EventHandler`'s `action()` method SHALL NOT terminate the `EventLoop` itself. The offending handler SHALL be removed, and the error SHALL be reported using the standard `net.openhft.chronicle.core.Jvm.warn()` logging mechanism by default. - -=== 4.3 Idle Strategy (Pauser) - -. *Supported Pauser Modes* -*THR-FN-010* The library SHALL ship with a set of standard `Pauser` strategies, configurable via `PauserMode` enum values, including at least: `BUSY`, `TIMED_BUSY`, `YIELDING`, `BALANCED`, `MILLI`, and `SLEEPY`. (Refer to `net.openhft.chronicle.threads.PauserMode.java` and `README.adoc` for details on each mode). -. *Custom Pauser Pluggability* -*THR-FN-011* Applications SHALL be able to supply custom `Pauser` implementations via programmatic configuration (e.g., through builder APIs like `EventGroupBuilder.withPauser(Pauser customPauser)`). -. *Adaptive Back-off Parameterisation* -*THR-FN-012* Adaptive pausers (such as `LongPauser`, which underpins modes like `BALANCED` and `SLEEPY`) SHALL allow parameterisation of their back-off behaviour, including aspects like minimum busy-spin duration, yield duration, and minimum/maximum sleep times. -. *TimingPauser Support* -*THR-NF-O-013* `TimingPauser` implementations (e.g., `LongPauser`, `BusyTimedPauser`) SHALL expose pause-related metrics (e.g., total time paused via `timePaused()`, pause count via `countPaused()`) and SHALL optionally throw `java.util.concurrent.TimeoutException` when a configured timeout duration is exceeded during a timed pause. -. *Zero-Allocation Hot Path* -*THR-NF-P-014* The hot path methods `Pauser.pause()` and `Pauser.reset()` for built-in pausers SHALL NOT allocate heap objects. - -=== 4.4 Thread Affinity and CPU Isolation - -. *CPU Affinity API* -*THR-FN-015* The library SHALL provide mechanisms to bind `EventLoop` threads to specific CPU cores, utilizing Chronicle Affinity. 
This SHALL be configurable via builder APIs (e.g., `EventGroupBuilder.withBinding(String affinity)`). -. *CPU Isolation Guidance* -*THR-DOC-016* Documentation (`README.adoc`) SHALL recommend OS-level isolation of CPU cores (e.g., using `isolcpus` on Linux) for `EventLoop` threads when latency-sensitive pausers (like `PauserMode.BUSY` or `PauserMode.TIMED_BUSY`) are used, to minimize jitter. -. *NUMA Awareness* -*THR-FN-017* Builders SHOULD allow configuration that facilitates pinning `EventLoop` threads to specific NUMA nodes. This is typically achieved via the `binding` string syntax provided to Chronicle Affinity, which can specify core layouts respecting NUMA topology. - -=== 4.5 Monitoring and Diagnostics - -. *Loop Block Monitoring* -*THR-NF-O-018* For an `EventGroup`, a dedicated monitor loop (e.g., `MonitorEventLoop`) SHALL, by default, measure the wall-clock duration of `EventHandler` invocations on other event loops within the group. -*THR-NF-O-019* If an `EventHandler`'s `action()` method duration exceeds a configurable threshold (defaulting to a value specified by `loop.block.threshold.ns`), the framework SHALL capture and log a stack trace of the event loop thread executing that handler. -. *Monitoring Configuration Toggles* -*THR-OPS-020* Loop block monitoring MAY be disabled globally via a system property (e.g., `disableLoopBlockMonitor=true`). The monitoring interval SHALL also be configurable (e.g., `MONITOR_INTERVAL_MS`). -. *Pauser Metrics Accessibility* -*THR-NF-O-021* Key metrics from `Pauser` instances, such as `timePaused()` and `countPaused()`, SHALL be programmatically accessible. Implementations of `PauserMonitorFactory` MAY provide handlers to monitor and log these. -. 
*Low-Overhead Monitoring* -*THR-NF-P-022* When all `EventHandler` invocations are within their execution thresholds, the loop block monitoring mechanism MUST impose negligible overhead (target < 1 us overhead per monitored loop per second, excluding logging actions if a threshold is breached). - -=== 4.6 Configuration and Deployment - -. *System Property Overrides* -*THR-OPS-023* Default behaviours and parameters (e.g., pauser modes, monitor intervals, logging thresholds) SHALL be overridable via JVM system properties. (Refer to `systemProperties.adoc` for a comprehensive list; Appendix B provides examples). -. *Programmatic Configuration Precedence* -*THR-OPS-024* Programmatic configurations provided via builder APIs SHALL take precedence over global system property settings. -. *Graceful JVM Shutdown Hook* -*THR-OPS-025* An `EventGroup` instance, when configured appropriately (e.g., via `((net.openhft.chronicle.core.io.AbstractCloseable) eventGroup).addShutdownHook(true)`), SHALL attempt to `close()` automatically on JVM exit. -. *JDK Compatibility* -*THR-NF-O-026* Chronicle Threads SHALL run on Java 11 LTS or newer. The library SHALL default to using platform threads. While aiming for compatibility with JDK Project Loom (virtual threads) for suitable use cases (e.g., blocking handlers not requiring core affinity), full support and affinity guarantees with virtual threads depend on JDK evolution and are subject to considerations detailed in Section 8 (Open Issues). - -== 5 Key Performance Targets -These non-functional targets guide the design and optimization of Chronicle Threads for ultra-low-latency scenarios. -They are primarily applicable when using performance-oriented pausers (e.g., `BUSY`) on suitably configured systems (e.g., with isolated cores). - -*THR-NF-P-027* **Latency:** Single-hop message processing through an event handler SHALL target <= 10 us at the 99.99th percentile on commodity x86_64 hardware with a busy pauser and isolated cores. 
-*THR-NF-P-028* **Jitter:** Peak-to-peak variation in handler execution time SHALL target <= 2 us under steady load conditions for well-behaved handlers. -*THR-NF-P-029* **Throughput:** A single "fast core" `EventLoop` SHALL be capable of processing >= 5 million simple (e.g., 64-byte payload) events per second. -*THR-NF-P-030* **Heap Allocation:** In the hot path of event processing (i.e., within the `EventLoop` and `EventHandler.action()` calls for common use cases), heap allocation SHALL target <= 0.1 Bytes per event on average. Pauser hot paths are covered by THR-NF-P-014. -*THR-NF-P-031* **CPU Utilisation:** -* `PauserMode.BUSY` SHALL consume 100% of its assigned (and ideally isolated) CPU core. -* Adaptive pauser modes (e.g., `BALANCED`, `SLEEPY`) SHALL reduce CPU consumption significantly (e.g., target < 20%) when the event loop is idle. - -== 6 Use-Case Scenarios - -=== 6.1 Matching Engine -A financial matching engine processes incoming orders and market data. -*Multiple* `EventHandler` instances (e.g., for order book management, risk checks, trade execution) share a `HIGH` priority `EventLoop` pinned to an isolated CPU core (Core 2). (Illustrates: THR-FN-005, THR-FN-006, THR-FN-015, THR-NF-P-027) -A `MEDIUM` priority `EventLoop` on a separate core (Core 3) handles journalling of trades and significant events to Chronicle Queue. -(Illustrates: THR-FN-005, THR-FN-015) -A `MONITOR` loop, possibly on a non-isolated core, supervises both application loops. -(Illustrates: THR-NF-O-018) - -=== 6.2 Bursty Telemetry Ingestion -An `EventLoop` configured with a `PauserMode.BALANCED` ingests UDP packets containing telemetry data. -The `EventHandler` parses these packets (e.g., using Chronicle Wire) and forwards them to a Chronicle Queue for downstream processing. -During off-peak hours, CPU usage for this loop drops significantly (e.g., below 5%) due to the pauser's adaptive back-off. -During bursts, it processes events with low latency. 
-(Illustrates: THR-FN-010, THR-FN-012, THR-NF-P-031) - -== 7 References -* Chronicle Threads `README.adoc` (Provides overview, usage examples, and pauser details) -* `systemProperties.adoc` (Comprehensive list of configurable JVM system properties) -* `net.openhft.chronicle.core.threads.HandlerPriority` Javadoc (Definitive list of handler priorities) -* `net.openhft.chronicle.threads.PauserMode` Javadoc (Definitive list of pauser modes) -* Chronicle Affinity library documentation (For details on CPU binding syntax and capabilities) - -== 8 Open Issues / Future Enhancements -This section lists areas identified for potential future development or requiring further investigation. -They are not committed functional requirements for the current version. - -* Support for **carrier-thread reuse** with JDK virtual threads while retaining affinity guarantees where possible. -* First-class asynchronous I/O helper components to better integrate frameworks like Netty or `java.nio.channels` directly with `EventLoop`s. -* Live reconfiguration of `Pauser` parameters via JMX or a similar management interface. -* Enhanced built-in metrics publication mechanisms beyond basic pauser counters and loop-block logs. 
- -== 9 Appendices - -=== A Builder Example - -[source,java] ----- -EventGroup eg = EventGroup.builder() - .withName("MatchingEngineGroup") // Sets the base name for the EventGroup and its child loops - .withLoopCount(2) // Example: might influence number of certain types of loops if applicable - .withPauserMode(PauserMode.BUSY) // Sets default pauser for core loops - .withPriorities(EnumSet.of(HandlerPriority.HIGH, HandlerPriority.MEDIUM, HandlerPriority.MONITOR)) // Specify which handler priorities this group will support - .build(); - -// Add application-specific handlers -eg.addHandler(new MatchingEngineOrderHandler()); // Assuming this implements EventHandler -eg.addHandler(new JournalWriterHandler()); // Assuming this implements EventHandler - -eg.start(); - -// finally -eg.close(); ----- - -=== B System Properties Quick Reference -This is a non-exhaustive list of key system properties. -For a comprehensive list, refer to the `systemProperties.adoc` document. - -[cols="1,3"] -|=== -|*Property* |*Effect / Default (Illustrative)* - -|`pauserMode` |Global override for default pauser selection (e.g., `busy`, `balanced`). Used if not specified by builder. -|`loop.block.threshold.ns` |Nanoseconds before a handler invocation is flagged as a block (Default: 100,000,000 ns = 100 ms). -|`MONITOR_INTERVAL_MS` |Sampling interval for the monitor loop (Default: 100 ms). -|`disableLoopBlockMonitor` |Set to `true` to disable loop block monitoring (Default: `false`). -|`eventGroup.conc.threads` | Default number of threads for `CONCURRENT` priority handlers (Default: Varies, e.g., CPU cores / 4). -|`chronicle.disk.monitor.disable` | Set to `true` to disable the disk space monitor (Default: `false`). -|`chronicle.disk.monitor.threshold.percent` | Disk usage percentage above which warnings are issued (Default: 5%). 
-|=== diff --git a/src/main/docs/architecture-overview.adoc b/src/main/docs/architecture-overview.adoc new file mode 100644 index 000000000..53d2022ef --- /dev/null +++ b/src/main/docs/architecture-overview.adoc @@ -0,0 +1,188 @@ += Chronicle Threads Architecture Overview +:toc: +:sectnums: +:lang: en-GB +:source-highlighter: rouge + +== Purpose and Scope + +This guide explains how Chronicle Threads composes event loops, handlers, pausers, and monitoring into a cohesive runtime so that engineers can reason about placement, affinity, and operational behaviour. +It complements the functional catalogue in `project-requirements.adoc` and provides concrete design cues for solution architects. + +Out of scope are detailed API signatures (covered by Javadoc), exhaustive configuration tables (see `systemProperties.adoc`), and test methodology (see `thread-performance-targets.adoc`). + +== Document Map + +Chronicle Threads’ documentation set is organised as follows: + +* link:project-requirements.adoc[project-requirements.adoc] – full THR requirements catalogue (functional, non-functional, operational, documentation). +* link:functional-requirements.adoc[functional-requirements.adoc] – tabular view of key `THR-FN-*` behaviours grouped by domain. +* link:thread-performance-targets.adoc[thread-performance-targets.adoc] – non-functional performance targets (`THR-NF-P-*`) and benchmark methodology. +* link:operational-controls.adoc[operational-controls.adoc] – operational safeguards, CPU isolation, monitoring thresholds, configuration governance. +* link:thread-security-review.adoc[thread-security-review.adoc] – security posture and threat model for handler admission, affinity and telemetry. +* link:thread-safety-guide.adoc[thread-safety-guide.adoc] – confinement rules, hand-off patterns and testing strategies for handlers. +* `README.adoc` – getting-started guide and illustrative code snippets. +* `systemProperties.adoc` – system property reference for Chronicle Threads. 
+ +This architecture overview should be read alongside the requirements catalogue and then used as the primary entry point for design discussions. + +== Core Components and Packages + +Chronicle Threads exposes a small set of core types in `net.openhft.chronicle.threads`: + +* `EventGroup` / `EventGroupBuilder` – configure and manage related `EventLoop` instances, including core, blocking, timer and monitor loops. +* `EventLoop` implementations – `CoreEventLoop`, `MediumEventLoop`, `BlockingEventLoop`, `MonitorEventLoop` and `VanillaEventLoop` implement different scheduling and threading characteristics. +* `Pauser` and implementations – `BusyPauser`, `BusyTimedPauser`, `LongPauser`, `MilliPauser`, `YieldingPauser`, `TimingPauser` provide pluggable idle strategies. +* `PauserMode` – maps configuration-friendly names (for example `BUSY`, `TIMED_BUSY`, `BALANCED`) to concrete `Pauser` strategies. +* Monitoring utilities – `ThreadMonitor`, `ThreadMonitors`, `PauserMonitorFactory` and `NotifyDiskLow` integrate loop health and disk space checks into the runtime. +* Helper utilities – `Threads`, `EventLoops`, `EventHandlers`, `NamedThreadFactory` provide convenience operations for creating and managing threads and handlers. + +Responsibilities are deliberately narrow: event loops orchestrate handler execution, pausers decide what to do when idle, and monitoring utilities observe behaviour without enforcing business policy. + +== Event Loop Topologies + +Chronicle Threads organises work into named `EventLoop` instances that the `EventGroup` manages (THR-FN-001). +Each loop is single-threaded for handler execution (THR-FN-006) and is categorised by handler priority (THR-FN-005). + +.... 
+ EventGroup + | + +-- CoreLoop[HIGH|MEDIUM] ---> fast path handlers (trading logic, matching) + | + +-- BlockingPool[BLOCKING] --> dedicated threads for I/O or storage waits + | + +-- TimerLoop[TIMER] ------> scheduled maintenance and time-based work + | + +-- MonitorLoop[MONITOR] -> observes loop-block latency and pauser metrics +.... + +Handlers attach to the loop whose priority matches their declared `HandlerPriority`. +An `EventGroup` materialises loops for the priorities included in its configured priority set and omits others; blocking and concurrent loops are only created when required by configuration. +If the configured priority set is empty only the monitor loop is present and attempts to add a handler with any other priority are rejected. +Applications can deploy multiple `EventGroup` instances in the same JVM to isolate subsystems whilst sharing pauser implementations. + +== Handler Lifecycle and Serial Execution + +Handlers are added at runtime via `EventGroup.addHandler()` (THR-FN-004). +The loop invokes each handler serially, ensuring stateful logic can remain lock-free (THR-FN-006). +The handler signals its progress via the boolean return value of `action()` (THR-FN-007). +Self-removal uses `InvalidEventHandlerException` (THR-FN-008); the loop removes the handler, logs through the standard `Jvm` channel, and continues running (THR-NF-O-009). + +Handlers should bound their execution time so that monitor loops can flag outliers reliably (THR-NF-O-018). +Long-running work belongs on the `BLOCKING` priority where independent threads handle it. +When reconfiguring a live loop, call `EventLoop.addHandler()` on the owning loop thread or rely on the concurrency-safe wrappers provided by `EventGroup`. + +== Pauser Strategy and Scheduler Interaction + +Pausers implement the idle strategy for each loop and are configured via builders or per-loop overrides (THR-FN-010, THR-FN-011). 
+Adaptive pausers expose tuning parameters that balance busy-spin and sleeping phases (THR-FN-012) while exposing metrics for observability (THR-NF-O-013, THR-NF-O-021). + +* `BUSY` / `TIMED_BUSY`: Bind to isolated cores, targeting nanosecond wake-up latency (THR-DOC-016). +* `BALANCED` / `SLEEPY`: Combine spin, yield, and park for mixed workloads. +* Custom: Provide a bespoke `Pauser` for domain-specific throttling. + +Hot paths avoid allocations (THR-NF-P-014) so a pauser change cannot introduce garbage. +Each loop records the time spent paused, supporting utilisation diagnostics. + +== Affinity and NUMA Alignment + +Affinity strings supplied via builders control how loops bind to hardware threads (THR-FN-015). +They accept the Chronicle Affinity syntax, including NUMA-aware layouts (THR-FN-017). +Example: + +---- +EventGroup eg = EventGroup.builder() + .withName("risk-eg") + .withBinding("0,2-3") + .build(); +---- + +* `0` binds the primary high-priority loop to core 0. +* `2-3` pins additional loops (e.g., MONITOR or BLOCKING) across cores 2 and 3. + +When multiple `EventGroup` instances coexist, coordinate bindings to avoid core contention. +Document selected affinities alongside deployment manifests so operators can validate CPU isolation. + +== Monitoring Plane + +Each `EventGroup` provisions a monitor loop that samples execution times and resets pausers at configurable intervals (THR-NF-O-018, THR-NF-O-019, THR-OPS-020). +The monitor loop: + +* Measures handler invocation duration, logging stack traces for breaches. +* Publishes pauser metrics through configured `PauserMonitorFactory` hooks. +* Responds to system properties that disable or tune monitoring (THR-OPS-023). + +The monitoring loop is not latency-critical but must keep pace with the core loops to avoid stale diagnostics. +Ensure JVM logging levels capture WARN messages from monitor handlers in production. 
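The loop-block check performed by the monitor loop can be illustrated with plain JDK code. This is a minimal sketch under stated assumptions, not the Chronicle Threads implementation: the class `LoopBlockSketch` and all of its methods are hypothetical names, and the threshold is passed in explicitly rather than read from `loop.block.threshold.ns`.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch of loop-block detection; not part of Chronicle Threads. */
final class LoopBlockSketch {
    /** Sentinel meaning "no handler is currently running". */
    private static final long IDLE = Long.MIN_VALUE;

    private final AtomicLong handlerStartedNs = new AtomicLong(IDLE);
    private final Thread loopThread;
    private final long thresholdNs;

    LoopBlockSketch(Thread loopThread, long thresholdNs) {
        this.loopThread = loopThread;
        this.thresholdNs = thresholdNs;
    }

    /** Called by the loop thread immediately before a handler runs. */
    void handlerStarted(long nowNs) {
        handlerStartedNs.set(nowNs);
    }

    /** Called by the loop thread immediately after a handler returns. */
    void handlerFinished() {
        handlerStartedNs.set(IDLE);
    }

    /**
     * Called by the monitor thread. Returns the elapsed time in nanoseconds if the
     * current invocation exceeds the threshold, or -1 if the loop is idle or within it.
     */
    long blockedForNs(long nowNs) {
        long started = handlerStartedNs.get();
        if (started == IDLE)
            return -1L;
        long elapsed = nowNs - started;
        return elapsed > thresholdNs ? elapsed : -1L;
    }

    /** Called by the monitor thread: formats the loop thread's current stack for logging. */
    String describeBlock(long elapsedNs) {
        StringBuilder sb = new StringBuilder();
        sb.append(loopThread.getName()).append(" blocked for ").append(elapsedNs).append(" ns");
        for (StackTraceElement element : loopThread.getStackTrace())
            sb.append("\n\tat ").append(element);
        return sb.toString();
    }

    public static void main(String[] args) {
        // Simulated timestamps keep the demonstration deterministic.
        LoopBlockSketch monitor = new LoopBlockSketch(Thread.currentThread(), 1_000L);
        monitor.handlerStarted(0L);
        long elapsed = monitor.blockedForNs(5_000L);
        if (elapsed >= 0)
            System.out.println(monitor.describeBlock(elapsed));
        monitor.handlerFinished();
    }
}
```

In this sketch the loop thread pays only for two writes to an `AtomicLong` per handler invocation, while the stack capture happens on the monitor thread and only after a breach. That split is one way to keep monitoring cost low when handlers stay within their thresholds.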
+ +== Performance Characteristics + +Chronicle Threads is designed to meet strict non-functional performance targets (THR-NF-P-014, THR-NF-P-027..THR-NF-P-031). +The detailed benchmarks and test methodology are documented in `thread-performance-targets.adoc`; this section summarises their architectural implications. + +Latency and jitter:: +Fast-path loops configured with busy pausers and isolated cores target single-hop handler latencies of <= 10 microseconds at the 99.99th percentile and tight jitter envelopes. +Architecturally this drives the use of single-threaded loops, busy-spin pausers, and affinity-aware deployment, as described in the sections above. + +Throughput and utilisation:: +On suitable hardware a single fast loop is expected to process millions of simple events per second whilst keeping CPU utilisation aligned with input rate. +Pauser strategies and monitoring hooks expose utilisation metrics so operators can confirm that loops saturate cores only when necessary. + +Allocation profile:: +The library aims for zero allocations in hot-path pauser calls and minimal allocations in event loop and handler invocation paths (THR-NF-P-014, THR-NF-P-030). +Design choices such as reusable exception instances, careful logging paths, and avoidance of per-iteration object creation follow from this requirement. + +Benchmarking and regression control:: +Performance contracts are enforced via dedicated benchmarks and soak tests described in `thread-performance-targets.adoc`. +Architectural changes that affect loop structure, pauser implementations or monitoring behaviour should be evaluated against these targets and referenced with the relevant THR identifiers in design discussions and pull requests. + +== Integration Touchpoints + +Chronicle Threads commonly underpins Chronicle Queue tailers, Chronicle Map maintenance tasks, and application-specific pipelines. 
+When integrating: + +* Use `net.openhft.chronicle.core.io.Closeable` semantics to align handler lifecycle with queue appenders or tailers. +* Combine telemetry exports with the monitor loop to funnel utilisation metrics to the estate-wide monitoring system. +* Align handler priorities with data criticality so that core loops handle order flow while auxiliary loops manage persistence, replay, or housekeeping. + +Refer to `README.adoc` for code-level examples, to link:operational-controls.adoc[operational-controls.adoc] for deployment-time safeguards, and to link:thread-safety-guide.adoc[thread-safety-guide.adoc] and link:thread-security-review.adoc[thread-security-review.adoc] for ownership and security guidance. + +== Configuration Overview + +Chronicle Threads is configured through a combination of builder APIs and JVM system properties. +Builders capture topology and pauser choices in code, while properties provide deployment-time overrides for monitoring and operational thresholds. + +Builder configuration:: +* `EventGroupBuilder` methods control loop naming, pauser selection (`withPauser(Pauser)` or pauser mode), affinity bindings and thread factory. +* Application-specific configuration classes typically own an `EventGroupBuilder`, apply environment-specific defaults and then expose a built `EventGroup` to the rest of the system. + +System properties:: +* Loop-block monitoring is governed by properties such as `loop.block.threshold.ns`, `MONITOR_INTERVAL_MS` and `disableLoopBlockMonitor` (described in `systemProperties.adoc` and `thread-performance-targets.adoc`). +* Disk space monitoring is controlled by `chronicle.disk.monitor.disable` and `chronicle.disk.monitor.threshold.percent`, as documented in `systemProperties.adoc` and decision THR-OPS-003. +* Pauser behaviour can be influenced by properties like `pauserMode`, `pauser.minProcessors` and `eventGroup.conc.threads`. 
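As an illustration of property-driven overrides, the following sketch reads the three loop-block monitoring properties named above using only standard JDK calls. It is not how Chronicle Threads parses them internally: the class `MonitorSettings` is hypothetical, and the fallback values (100 ms threshold, 100 ms interval, monitoring enabled) are assumptions chosen for illustration; `systemProperties.adoc` remains the authority on real defaults.

```java
/** Hypothetical holder for monitoring settings; not a Chronicle Threads class. */
final class MonitorSettings {
    final long blockThresholdNs;
    final long monitorIntervalMs;
    final boolean monitorDisabled;

    private MonitorSettings(long blockThresholdNs, long monitorIntervalMs, boolean monitorDisabled) {
        this.blockThresholdNs = blockThresholdNs;
        this.monitorIntervalMs = monitorIntervalMs;
        this.monitorDisabled = monitorDisabled;
    }

    /** Reads the JVM system properties, falling back to assumed defaults when unset. */
    static MonitorSettings fromSystemProperties() {
        return new MonitorSettings(
                // Assumed default: 100 ms expressed in nanoseconds.
                Long.getLong("loop.block.threshold.ns", 100_000_000L),
                // Assumed default: sample every 100 ms.
                Long.getLong("MONITOR_INTERVAL_MS", 100L),
                // Boolean.getBoolean returns false when the property is absent.
                Boolean.getBoolean("disableLoopBlockMonitor"));
    }

    public static void main(String[] args) {
        MonitorSettings settings = fromSystemProperties();
        System.out.println("thresholdNs=" + settings.blockThresholdNs
                + " intervalMs=" + settings.monitorIntervalMs
                + " disabled=" + settings.monitorDisabled);
    }
}
```

Launching the JVM with, for example, `-Dloop.block.threshold.ns=250000` would change the first value reported by this sketch without any code change.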
+ +Configuration precedence:: +* Code-level defaults in builders provide safe baselines for development. +* Environment-specific profiles (for example YAML, property files or Spring configuration) override builder defaults for certification and production. +* JVM `-D` flags are reserved for exceptional or per-node tweaks and should be documented in run-books to avoid drift. + +== Trade-offs and Alternatives + +Single-threaded loops versus thread pools:: +Chronicle Threads adopts single-threaded event loops for handler execution (THR-FN-001, THR-FN-006), favouring predictable latency and simple state management over raw parallel throughput. +The main alternative, a shared thread pool invoking handlers concurrently, offers more parallelism but at the cost of lock contention, more complex code and less deterministic tail latency. + +Busy-spin pausers versus balanced strategies:: +Busy pausers (for example `PauserMode.BUSY`, `PauserMode.TIMED_BUSY`) deliver the lowest wake-up latency but consume full cores (THR-NF-P-002). +Balanced or sleepy pausers reduce CPU usage by introducing yielding and sleeping phases at the expense of slightly higher wake-up times. +Operational controls and deployment artefacts should document which loops use which mode so that estate-level CPU planning reflects these trade-offs. + +Centralised monitoring versus minimal instrumentation:: +The built-in monitor loop and pauser metrics (THR-NF-O-018, THR-NF-O-021) provide rich observability but introduce a small amount of additional work per handler invocation. +Alternatives that rely solely on external monitoring simplify the event group but make it harder to diagnose loop-block events in context. +The chosen design keeps the monitoring plane lightweight while ensuring sufficient visibility for production incident response. 
+ +Per-module thread groups versus JVM-wide executors:: +`EventGroup` instances encapsulate loop topology and pauser choices for a given subsystem instead of sharing a single global executor. +This increases configuration surface area but lets each subsystem adopt a topology tailored to its latency, throughput and isolation needs. +Applications can still compose multiple `EventGroup` instances to share underlying hardware where appropriate. diff --git a/src/main/docs/decision-log.adoc b/src/main/docs/decision-log.adoc new file mode 100644 index 000000000..101be452f --- /dev/null +++ b/src/main/docs/decision-log.adoc @@ -0,0 +1,229 @@ += Chronicle Threads - Decision Log +:toc: +:lang: en-GB +:source-highlighter: rouge + +This file captures component-specific architectural and operational decisions for Chronicle Threads. +Identifiers follow the THR-TAG-NNN pattern, where TAG is a Nine-Box taxonomy tag (FN, NF-P, NF-S, NF-O, TEST, DOC, OPS, UX, RISK). +Numbers are unique within the THR scope. + +== Decision Index + +* link:#THR-FN-001[THR-FN-001 Event loops are single-threaded for handler execution] +* link:#THR-NF-P-002[THR-NF-P-002 Busy-spin pauser for latency-critical fast threads] +* link:#THR-OPS-003[THR-OPS-003 Background disk-space monitoring] +* link:#THR-DOC-004[THR-DOC-004 Nine-Box taxonomy for requirements and decisions] +* link:#THR-FN-005[THR-FN-005 EventGroup priority sets gate handler registration] + +== Decision Records + +[[THR-FN-001]] +=== THR-FN-001 Event loops are single-threaded for handler execution + +Date:: 2015-02-24 + +Context:: +* Chronicle Threads aims to simplify concurrent programming for event-driven systems while delivering predictable low-latency behaviour. +* Traditional multi-threaded handler models often require complex locking, which introduces contention, bugs, and unpredictable pauses on the hot path. + +Decision Statement:: +* Each `EventLoop` instance runs on a dedicated Java platform thread. 
+* All `EventHandler` instances registered with a given `EventLoop` (except handlers with `BLOCKING` or `CONCURRENT` priorities, which use their own threading model within an `EventGroup`) are executed serially by that single thread. + +Alternatives Considered:: +* Multi-threaded event loops:: +** Description: Execute handlers from a shared loop across a thread pool. +** Pros: Higher potential throughput if handlers are fully independent. +** Cons: Requires thread-safe handlers, more locking, less predictable latency, and harder reasoning about shared state. +* Actor model per handler:: +** Description: Each handler has its own queue and dedicated thread. +** Pros: Strong isolation between handlers. +** Cons: Higher resource usage (threads, queues) and higher message-passing overhead than serial invocation on one loop. + +Rationale for Decision:: +* Single-threaded loops make handler implementation simpler by removing most intra-loop locking requirements. +* They support low-jitter latency by avoiding lock contention in the main execution path. +* The model aligns with a common thread-per-core pattern used in low-latency systems. + +Impact & Consequences:: +* Positive: +** Simplifies handler development and testing. +** Improves latency predictability for all handlers on a given loop. +** Makes it easier to reason about state owned by handlers on one loop. +* Negative: +** Throughput of a single `EventLoop` is bounded by one CPU core. +** A misbehaving or blocking handler on a standard loop can starve other handlers on that loop. This is mitigated by using `EventGroup` with `BLOCKING` or `CONCURRENT` handlers on separate threads or thread pools. + +Notes/Links:: +* Core implementation: `VanillaEventLoop`, `MediumEventLoop`. +* Requirements: link:project-requirements.adoc[Handler management requirements, including THR-FN-006.] 
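The serial-execution model recorded in THR-FN-001 can be sketched in a few lines of plain Java. Everything here is a hypothetical stand-in written for illustration: `MiniLoop`, `Handler` and `RemoveMe` are not Chronicle Threads types, and the real `EventLoop`, `EventHandler` and `InvalidEventHandlerException` carry behaviour (pausers, priorities, monitoring) that this sketch omits.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Hypothetical miniature of a single-threaded event loop; not Chronicle Threads code. */
final class MiniLoop {
    /** Stand-in for a handler: returns true if it did useful work on this pass. */
    interface Handler {
        boolean action() throws RemoveMe;
    }

    /** Stand-in for a self-removal signal thrown from action(). */
    static final class RemoveMe extends Exception {
        RemoveMe() {
            super(null, null, false, false); // no message, no stack trace captured
        }
    }

    // Only ever touched by the thread that calls runOnce(), so no locking is needed.
    private final List<Handler> handlers = new ArrayList<>();

    void addHandler(Handler handler) {
        handlers.add(handler);
    }

    int handlerCount() {
        return handlers.size();
    }

    /** Runs every handler once, serially, on the calling thread; true if any was busy. */
    boolean runOnce() {
        boolean busy = false;
        for (Iterator<Handler> it = handlers.iterator(); it.hasNext(); ) {
            try {
                busy |= it.next().action();
            } catch (RemoveMe e) {
                it.remove(); // drop this handler and keep the loop running
            }
        }
        return busy;
    }

    public static void main(String[] args) {
        MiniLoop loop = new MiniLoop();
        int[] remaining = {3}; // plain state is safe: only one thread runs handlers
        loop.addHandler(() -> {
            if (remaining[0] == 0)
                throw new RemoveMe();
            remaining[0]--;
            return true;
        });
        while (loop.handlerCount() > 0) {
            boolean busy = loop.runOnce();
            System.out.println("busy=" + busy + " handlers=" + loop.handlerCount());
        }
    }
}
```

Because one thread owns both the handler list and the handlers' state, the handler in `main` mutates a plain array without synchronisation. The cost is the one listed under the negative consequences: a handler that blocks inside `action()` stalls every other handler in the same list.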
+ +[[THR-NF-P-002]] +=== THR-NF-P-002 Busy-spin pauser for latency-critical fast threads + +Date:: 2015-02-24 + +Context:: +* For latency-critical fast threads (often pinned to isolated cores), minimising wake-up time when new work arrives is essential. +* Standard pause strategies that yield or sleep introduce variable and often large delays before a thread resumes work. + +Decision Statement:: +* `EventLoop` instances configured for maximum performance (for example using `PauserMode.BUSY` or `PauserMode.TIMED_BUSY`) use a busy-spinning or near busy-spinning `Pauser` by default. +* Other pauser modes (`YIELDING`, `BALANCED`, `SLEEPY`, `MILLI`) remain available and are recommended for less latency-sensitive loops or where CPU cores are constrained. + +Alternatives Considered:: +* Yielding pauser as default:: +** Description: Call `Thread.yield()` when idle. +** Pros: More CPU-friendly than pure busy-spin. +** Cons: Higher and less predictable wake-up latency. +* Sleeping pauser as default:: +** Description: Sleep or park immediately when idle. +** Pros: Lowest CPU usage when idle. +** Cons: Highest wake-up latency and jitter. +* Adaptive pauser as universal default (for example `BALANCED`):: +** Description: Combine spinning, yielding, and sleeping adaptively. +** Pros: Good general-purpose balance. +** Cons: Not as fast as a dedicated busy-spin for the most latency-sensitive loops. + +Rationale for Decision:: +* Busy-spinning keeps the thread hot on its core and avoids scheduler and context-switch overhead, giving the minimum wake-up latency when events arrive. +* Users can explicitly opt into less aggressive pausers when CPU utilisation must be reduced. + +Impact & Consequences:: +* Positive: +** Minimises p99.99 and tail latency for designated fast threads. +** Provides predictable entry into handler code once an event is visible. 
+* Negative: +** Consumes a full CPU core even when idle, so requires careful system configuration (isolated cores, enough headroom for other work). +** Misuse (too many busy-spinning threads) can degrade overall system performance. + +Notes/Links:: +* Key classes: `Pauser`, `BusyPauser`, `PauserMode`. +* Requirements: link:project-requirements.adoc[Idle strategy and key performance targets (THR-FN-010, THR-NF-P-027..THR-NF-P-031).] +* Additional guidance: `README.adoc` pauser section. + +[[THR-OPS-003]] +=== THR-OPS-003 Background disk-space monitoring + +Date:: 2018-05-11 + +Context:: +* Chronicle Queue and other disk-backed Chronicle components can fail or lose data if the underlying storage fills up. +* Operations teams need early, component-aware warning of low disk-space conditions to act before failure. + +Decision Statement:: +* Chronicle Threads provides a background monitoring singleton `DiskSpaceMonitor`. +* The monitor periodically checks disk usage for paths associated with Chronicle components (for example queue directories as they are initialised). +* When free space falls below a configurable threshold, a `NotifyDiskLow` service (default `NotifyDiskLowLogWarn`) logs warnings. The threshold is configured via system property `chronicle.disk.monitor.threshold.percent`. + +Alternatives Considered:: +* No built-in monitoring:: +** Description: Rely solely on external system-level disk monitoring. +** Pros: Keeps the library simpler. +** Cons: External monitoring might not be tuned to Chronicle usage; users can forget to set it up. +* More aggressive built-in actions:: +** Description: Automatically halt writers when space is critically low. +** Pros: Could prevent writes that would immediately fail. +** Cons: Too intrusive for a general-purpose library; better handled by application policy. + +Rationale for Decision:: +* A built-in monitor provides a lightweight, Chronicle-aware early warning without enforcing policy. 
+* Logging is non-intrusive and keeps responsibility for operational response with the application and operations teams. +* The `ServiceLoader` based `NotifyDiskLow` contract allows custom notification or escalation strategies. + +Impact & Consequences:: +* Positive: +** Gives early warning of storage exhaustion for Chronicle workloads. +** Reduces the risk of unexpected failures caused by full disks. +* Negative: +** Adds minimal overhead (one background thread plus periodic I/O). +** Effectiveness depends on log monitoring or a custom `NotifyDiskLow` implementation. + +Notes/Links:: +* Key classes: `DiskSpaceMonitor`, `NotifyDiskLow`, `NotifyDiskLowLogWarn`. +* Properties: `chronicle.disk.monitor.disable`, `chronicle.disk.monitor.threshold.percent` (see also link:../../systemProperties.adoc[system properties table].) + +[[THR-DOC-004]] +=== THR-DOC-004 Nine-Box taxonomy for requirements and decisions + +Date:: 2025-03-01 + +Context:: +* The project needs a consistent, structured way to identify and trace requirements, decisions, tests, and possibly code artefacts. +* Earlier schemes using ad-hoc or simple sequential numbering made it difficult to infer scope or intent from an identifier alone. + +Decision Statement:: +* All functional requirements (in `project-requirements.adoc`) and architectural decisions (in this log) use identifiers prefixed with `THR-`, followed by a Nine-Box tag (for example `FN`, `NF-P`, `OPS`, `DOC`) and a sequence number (for example `THR-FN-001`). +* The Nine-Box tag definitions and usage guidelines are maintained in `AGENTS.md`. + +Alternatives Considered:: +* Simple sequential numbering (for example `REQ-001`, `DEC-001`):: +** Description: Use only numeric sequence numbers. +** Pros: Very easy to implement and explain. +** Cons: Conveys no information about type or domain. +* Custom project-specific categorisation scheme:: +** Description: Invent a bespoke set of categories for Chronicle Threads only. 
+** Pros: Could be tailored closely to the project. +** Cons: Extra effort to define and maintain; less transferable knowledge for staff working across multiple Chronicle projects. + +Rationale for Decision:: +* The Nine-Box taxonomy already provides a broadly applicable and documented set of categories. +* Using the same taxonomy across Chronicle projects improves consistency and makes it easier for people to move between modules. +* Including the tag in identifiers gives immediate context about what an item represents. + +Impact & Consequences:: +* Positive: +** Better traceability between requirements, decisions, tests, and code. +** Easier for team members to understand the role of an item from its ID. +** Supports better organisation and searching in documentation and tooling. +* Negative: +** Requires familiarity with the Nine-Box taxonomy and consistent application. +** Initial documentation and training overhead (captured in `AGENTS.md`). + +Notes/Links:: +* Nine-Box taxonomy details and guidance: `AGENTS.md`. +* Requirements catalogue: link:project-requirements.adoc[Functional requirements for Chronicle Threads.] + +[[THR-FN-005]] +=== THR-FN-005 EventGroup priority sets gate handler registration + +Date:: 2025-03-01 + +Context:: +* `EventGroup` coordinates multiple child loops and routes handlers based on their declared `HandlerPriority`. +* Not all deployments require every priority; some groups should expose a restricted set of priorities or be effectively disabled for application handlers. +* Earlier behaviour allowed handlers to be added even when the group had not been configured explicitly with a corresponding priority set, making it harder to reason about which handlers were admissible. + +Decision Statement:: +* `EventGroup` instances are configured with a set of supported handler priorities. +* Loops are only materialised for the priorities included in this set; other loops are omitted. 
+* Attempts to register a handler whose `HandlerPriority` has no corresponding loop (for example because that priority is not in the configured set) fail fast with an `IllegalStateException`. +* A group configured with an empty priority set retains only its monitor loop; attempts to add handlers with any other priority are rejected. + +Alternatives Considered:: +* Implicit support for all priorities:: +** Description: Treat all priorities as supported unless explicitly disabled, materialising loops lazily on first handler registration. +** Pros: Minimal configuration, potentially simpler for small deployments. +** Cons: Harder to predict resource usage and topology; accidental handler registration can silently create extra loops and threads. +* Soft failure for unsupported priorities:: +** Description: Drop or log handlers whose priorities are not supported instead of throwing. +** Pros: Avoids exceptions in misconfigured systems. +** Cons: Silent misconfiguration is dangerous; handlers may never run with little or no visibility. + +Rationale for Decision:: +* An explicit priority set makes `EventGroup` topology predictable and keeps resource allocation under operator control. +* Failing fast when a handler targets an unsupported priority surfaces configuration errors early in development and testing. +* Retaining only the monitor loop when the set is empty supports scenarios where an `EventGroup` instance is used purely for monitoring infrastructure rather than application handlers. + +Impact & Consequences:: +* Positive: +** Clearer reasoning about which handlers can attach to a given group. +** Better alignment between configuration (builder calls) and the underlying loop topology. +** Easier to construct priority-limited or monitor-only groups without surprising background loops. +* Negative: +** A stricter contract means some previously permissive configurations now fail fast and require explicit updates to their configured priority sets. 
+** Callers that expect to use priorities that are not enabled for a given group must update their configuration or routing. + +Notes/Links:: +* Requirements: link:project-requirements.adoc[THR-FN-004 and THR-FN-005 handler management requirements]. +* Key types: `EventGroup`, `EventGroupBuilder.withPriorities(java.util.Set)`, `HandlerPriority`. diff --git a/src/main/docs/functional-requirements.adoc b/src/main/docs/functional-requirements.adoc new file mode 100644 index 000000000..898f56881 --- /dev/null +++ b/src/main/docs/functional-requirements.adoc @@ -0,0 +1,73 @@ += Chronicle Threads - Functional Requirements Summary +:toc: +:lang: en-GB +:source-highlighter: rouge + +== Purpose + +This document provides a summary view of the functional requirements for Chronicle Threads. +It organises key `THR-FN-*` requirements from `src/main/docs/project-requirements.adoc` into domains and links them to tests, benchmarks and decision records. +The detailed specification, including examples and non-functional targets, remains in `project-requirements.adoc`. + +== Domain Overview + +[cols="1,3,3",options="header"] +|=== +| Tag range | Domain | Notes +| THR-FN-001 .. THR-FN-003 | Event loop lifecycle and configuration | Builders, start and close behaviour, non-restartable loops. +| THR-FN-004 .. THR-FN-008 | Handler management and execution contract | Dynamic registration, priorities, serial execution, self-deregistration. +| THR-FN-010 .. THR-FN-012 | Idle strategies and pausers | Built-in pausers, custom pausers, adaptive back-off. +| THR-FN-015, THR-FN-017 | Thread affinity and NUMA considerations | Binding event loops to cores and NUMA-aware configuration. +|=== + +== Event loop lifecycle and configuration (THR-FN-001 .. 
THR-FN-003) + +[cols="1,4,3",options="header"] +|=== +| ID | Requirement (summary) | Verification and references +| THR-FN-001 | Provide builder-style APIs (for example `EventLoopBuilder`, `EventGroupBuilder`) for fluent and immutable configuration of event loops and groups. | Covered by `EventGroupTest`, `EventLoopsTest` and other unit tests that construct groups via builders; end-to-end examples in `project-requirements.adoc`. +| THR-FN-002 | Support idempotent `start()` and graceful `close()` operations on `EventLoop` and `EventGroup`, ensuring active handlers complete and resources are released. | Verified by lifecycle tests such as `StopVCloseTest`, `EventGroupTest` and event-loop tests that start and close loops in different orders; usage patterns documented in the functional specification. +| THR-FN-003 | Reject or ignore attempts to restart an `EventLoop` or `EventGroup` after it has been closed, as loops are not restartable. | Behaviour exercised implicitly in lifecycle tests (for example `StopVCloseTest`) and stressed in long-running scenarios (for example `EventGroupStressTest`); notes in `project-requirements.adoc`. +|=== + +== Handler management and execution contract (THR-FN-004 .. THR-FN-008) + +[cols="1,4,3",options="header"] +|=== +| ID | Requirement (summary) | Verification and references +| THR-FN-004 | Allow clients to add `EventHandler` instances to a running `EventGroup` at runtime. | `EventGroupHandlerTest`, `EventGroupTest` and `EventGroupStressTest` register handlers after groups are started and confirm they are invoked; examples in the functional specification. +| THR-FN-005 | Require each `EventHandler` to declare a `HandlerPriority`, which influences execution order and loop assignment; `EventGroup` instances configure a set of supported priorities and reject handlers whose priority has no corresponding loop. 
| Verified indirectly via `EventGroupTest`, `MediumEventLoopTest` and `BlockingEventLoopTest`, which route handlers to different loops and assert expected scheduling behaviour; reference `HandlerPriority` Javadoc and `EventGroupBuilder.withPriorities(...)` documentation. +| THR-FN-006 | Invoke each handler's `action()` method serially on the dedicated event loop thread, except for priorities that use their own threading model (such as `BLOCKING` or `CONCURRENT`). | `VanillaEventLoopTest`, `MediumEventLoopTest`, `BlockingEventLoopTest` and `EventLoopConcurrencyStressTest` assume single-threaded handler execution and validate behaviour under contention; rationale captured in the decision log. +| THR-FN-007 | Require `action()` to return a boolean flag indicating whether useful work was performed, guiding the pauser and scheduling decisions. | `TestEventHandlers` and `PauserTest` exercise different sequences of `true` and `false` returns and observe how pausers respond. +| THR-FN-008 | Allow handlers to request self-deregistration by throwing `InvalidEventHandlerException` from `action()`. | `EventGroupHandlerTest` covers self-deregistration scenarios, confirming handlers are removed while the loop continues to run. +|=== + +== Idle strategies and pausers (THR-FN-010 .. THR-FN-012) + +[cols="1,4,3",options="header"] +|=== +| ID | Requirement (summary) | Verification and references +| THR-FN-010 | Provide a set of standard `Pauser` strategies selectable via `PauserMode` (for example `BUSY`, `TIMED_BUSY`, `YIELDING`, `BALANCED`, `MILLI`, `SLEEPY`). | `PauserTest`, `YieldingPauserTest`, `PauserTimeoutTest` and `LongPauserTest` instantiate and exercise built-in pausers; documentation in `README.adoc` and `project-requirements.adoc`. +| THR-FN-011 | Allow applications to supply custom `Pauser` implementations through configuration, such as builder methods on `EventGroupBuilder`. 
| Verified by tests that provide custom pausers to loops (for example in `PauserTest`) and confirm they integrate correctly with the loop lifecycle. +| THR-FN-012 | Provide adaptive pausers (such as `LongPauser`) that can be configured for different back-off strategies, including spin, yield and sleep phases. | `LongPauserTest`, `LongPauserBenchmark` and `PauserTimeoutTest` configure adaptive pausers with different parameters and measure behaviour; performance targets described in the functional specification. +|=== + +== Thread affinity and NUMA considerations (THR-FN-015, THR-FN-017) + +[cols="1,4,3",options="header"] +|=== +| ID | Requirement (summary) | Verification and references +| THR-FN-015 | Provide mechanisms to bind event loop threads to specific CPU cores using Chronicle Affinity. | `EventGroupBadAffinityTest` and `EventGroupTest` exercise affinity configuration and failure modes; documentation in `project-requirements.adoc` and Chronicle Affinity guides. +| THR-FN-017 | Allow configuration patterns that support NUMA-aware placement of event loops and handler workloads. | Documented via configuration examples in `project-requirements.adoc` and `architecture-overview.adoc`; NUMA-related behaviour is validated in higher-level integration environments. +|=== + +== Relationship to decision log + +* `src/main/docs/decision-log.adoc` captures the rationale behind key functional choices, including the single-threaded event loop model, pauser defaults and background monitors. +* When functional behaviour changes in ways that affect these requirements, update both this summary and the project-level specification, and record the rationale in the decision log. + +== Relationship to tests + +The test classes under `src/test/java/net/openhft/chronicle/threads` provide executable examples of the behaviours described above. 
+They are not yet annotated with requirement identifiers, but the tables in this document list representative tests for each requirement group so reviewers can trace behaviours to concrete test coverage. +Future work may add explicit THR identifiers to test names or comments to strengthen this traceability further. diff --git a/src/main/docs/thread-operational-controls.adoc b/src/main/docs/operational-controls.adoc similarity index 83% rename from src/main/docs/thread-operational-controls.adoc rename to src/main/docs/operational-controls.adoc index 9d8626b41..ef8d806a8 100644 --- a/src/main/docs/thread-operational-controls.adoc +++ b/src/main/docs/operational-controls.adoc @@ -2,6 +2,18 @@ :toc: :sectnums: :lang: en-GB +:source-highlighter: rouge + +== Relationship to Other Documents + +This guide focuses on operational practices for deploying and running Chronicle Threads in production environments. +It should be read alongside: + +* link:architecture-overview.adoc[architecture-overview.adoc] – runtime topology, event loop composition and pauser strategy. +* link:thread-performance-targets.adoc[thread-performance-targets.adoc] – performance targets and benchmarking methodology. +* link:thread-security-review.adoc[thread-security-review.adoc] – security considerations for handler admission, affinity and telemetry. +* link:thread-safety-guide.adoc[thread-safety-guide.adoc] – handler confinement and testing guidance. +* `systemProperties.adoc` – detailed system property descriptions used by the controls in this document. 
== CPU Isolation and Affinity Governance diff --git a/src/main/docs/project-requirements.adoc b/src/main/docs/project-requirements.adoc index 7d533e034..b537873e6 100644 --- a/src/main/docs/project-requirements.adoc +++ b/src/main/docs/project-requirements.adoc @@ -1,10 +1,10 @@ -= Chronicle Threads – Functional Requirements Specification += Chronicle Threads - Functional Requirements Specification +:toc: +:sectnums: +:lang: en-GB :revnumber: 1.0 :revdate: 2025-05-25 -:sectnums: -:toc: :source-highlighter: rouge -:lang: en-GB == Purpose and Scope @@ -26,7 +26,7 @@ _Out of scope_ are: |_EventHandler_ |Application-provided component that implements `action()`; executed by an `EventLoop`. |_EventGroup_ |A container that manages one or more `EventLoop`s, potentially including a dedicated monitor loop. |_Pauser_ |Strategy object that controls how an idle loop waits (e.g., busy-spin, yield, sleep). -|_Fast Thread_ |A thread—usually pinned to an isolated CPU core—running a latency-critical event loop. +|_Fast Thread_ |A thread-usually pinned to an isolated CPU core-running a latency-critical event loop. |_Hot Path_ / _Fast Path_ |The code execution path that is performance-critical and frequently executed, where low latency and minimal overhead are paramount. |_Loop-Block Monitor_ |Background handler that measures `EventHandler` run-time and logs outliers exceeding a configured threshold. |_NUMA_ |Non-Uniform Memory Access; a memory architecture where memory access time depends on the memory location relative to a processor. @@ -42,7 +42,7 @@ _Out of scope_ are: == High-Level Functional Requirements -=== Event‐Loop Lifecycle and Configuration +=== Event-Loop Lifecycle and Configuration . _Creation and Configuration_ _THR-FN-001_ The library SHALL provide builder APIs (e.g., `EventLoopBuilder`, `EventGroupBuilder`) for fluent and immutable configuration of event loops and groups. 
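Review note on THR-FN-001 and the priority-set change in this PR: the sketch below shows one way a fluent builder can stay immutable by copying the caller's priority set, and how a fail-fast check for unsupported priorities might look. `PrioritySetSketch`, `Priority`, `GroupConfig`, `Builder` and `copy` are hypothetical stand-ins written for illustration, not the Chronicle Threads types, and the always-available monitor priority is an assumption that mirrors the decision record. One JDK detail worth checking against the real builder: `EnumSet.copyOf(Collection)` throws `IllegalArgumentException` when given an empty collection that is not itself an `EnumSet`, so the sketch handles the empty case explicitly.

```java
import java.util.Collection;
import java.util.Collections;
import java.util.EnumSet;
import java.util.Set;

// Hypothetical stand-ins for illustration only; these are NOT the Chronicle Threads classes.
final class PrioritySetSketch {

    enum Priority { HIGH, MEDIUM, MONITOR }

    /** Immutable configuration produced by the builder. */
    static final class GroupConfig {
        private final Set<Priority> priorities;

        GroupConfig(Collection<Priority> priorities) {
            // Copy again so the config never shares state with the builder.
            this.priorities = Collections.unmodifiableSet(copy(priorities));
        }

        boolean supports(Priority p) {
            // Assumption: the monitor loop is always retained, even for an empty set.
            return p == Priority.MONITOR || priorities.contains(p);
        }

        /** Fails fast when a handler targets a priority with no loop. */
        void checkSupported(Priority p) {
            if (!supports(p))
                throw new IllegalStateException("No loop configured for priority " + p);
        }
    }

    static final class Builder {
        private Set<Priority> priorities = EnumSet.allOf(Priority.class);

        Builder withPriorities(Collection<Priority> requested) {
            // Defensive copy: later changes to the caller's set are not observed.
            this.priorities = copy(requested);
            return this;
        }

        GroupConfig build() {
            return new GroupConfig(priorities);
        }
    }

    // EnumSet.copyOf(Collection) rejects an empty non-EnumSet collection,
    // so the empty case is routed to EnumSet.noneOf instead.
    static Set<Priority> copy(Collection<Priority> source) {
        return source.isEmpty() ? EnumSet.noneOf(Priority.class) : EnumSet.copyOf(source);
    }

    public static void main(String[] args) {
        Set<Priority> requested = EnumSet.of(Priority.HIGH);
        GroupConfig config = new Builder().withPriorities(requested).build();
        requested.add(Priority.MEDIUM); // mutation after build() has no effect on config
        System.out.println("MEDIUM supported: " + config.supports(Priority.MEDIUM));

        GroupConfig monitorOnly =
                new Builder().withPriorities(Collections.<Priority>emptySet()).build();
        try {
            monitorOnly.checkSupported(Priority.HIGH);
        } catch (IllegalStateException expected) {
            System.out.println("Rejected: " + expected.getMessage());
        }
    }
}
```

Copying in both the builder method and the constructor is deliberate in this sketch: the first copy isolates the builder from the caller, the second isolates each built configuration from later builder calls.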
@@ -59,7 +59,7 @@ _THR-FN-004_ Clients SHALL be able to add an `EventHandler` to a running `EventG . _Handler Priority_ _THR-FN-005_ Each `EventHandler` SHALL declare a `HandlerPriority`. The system SHALL support a range of priorities influencing execution order and/or loop assignment. -(Refer to `net.openhft.chronicle.core.threads.HandlerPriority` enum for the exhaustive list of priorities, e.g., `HIGH`, `MEDIUM`, `LOW`, `TIMER`, `BLOCKING`, `REPLICATION`, `CONCURRENT`, `MONITOR`, `DAEMON`). +(Refer to `net.openhft.chronicle.core.threads.HandlerPriority` enum for the exhaustive list of priorities, e.g., `HIGH`, `MEDIUM`, `LOW`, `TIMER`, `BLOCKING`, `REPLICATION`, `CONCURRENT`, `MONITOR`, `DAEMON`). An `EventGroup` SHALL be configured with a set of supported priorities; loops are only created for enabled priorities and attempts to add a handler whose priority has no corresponding loop SHALL be rejected. . _Execution Contract_ _THR-FN-006_ An `EventHandler`'s `action()` method MUST be invoked serially on the `EventLoop`'s dedicated thread. _THR-FN-007_ The `action()` method MUST return a boolean: `true` if useful work was done (suggesting the handler may have more immediate work), `false` otherwise. @@ -142,10 +142,10 @@ _THR-NF-P-031_ *CPU Utilisation:* [cols="1,4,3",options="header"] |=== |ID |Requirement |Artefact(s) -|THR-DOC-032 |Publish an architecture overview that illustrates loop composition, handler routing, and affinity guidance, keeping it aligned with the functional catalogue. |link:thread-architecture-overview.adoc[thread-architecture-overview.adoc] -|THR-DOC-033 |Maintain an operational controls playbook detailing CPU isolation, monitoring thresholds, and configuration precedence so operators can enforce safe defaults. 
|link:thread-operational-controls.adoc[thread-operational-controls.adoc] +|THR-DOC-032 |Publish an architecture overview that illustrates loop composition, handler routing, and affinity guidance, keeping it aligned with the functional catalogue. |link:architecture-overview.adoc[architecture-overview.adoc] +|THR-DOC-033 |Maintain an operational controls playbook detailing CPU isolation, monitoring thresholds, and configuration precedence so operators can enforce safe defaults. |link:operational-controls.adoc[operational-controls.adoc] |THR-DOC-034 |Record security considerations for handler admission, affinity, telemetry integrity, and dependency posture, updating the review after material changes. |link:thread-security-review.adoc[thread-security-review.adoc] -|THR-DOC-035 |Provide a thread-safety guide that explains confinement rules, hand-off patterns, and testing practices for event loop handlers. |link:thread-thread-safety-guide.adoc[thread-thread-safety-guide.adoc] +|THR-DOC-035 |Provide a thread-safety guide that explains confinement rules, hand-off patterns, and testing practices for event loop handlers. |link:thread-safety-guide.adoc[thread-safety-guide.adoc] |THR-TEST-036 |Automate benchmark and soak tests that demonstrate compliance with latency, jitter, throughput, and allocation targets. |link:thread-performance-targets.adoc[thread-performance-targets.adoc] |=== @@ -174,11 +174,11 @@ During bursts, it processes events with low latency. 
* `net.openhft.chronicle.core.threads.HandlerPriority` Javadoc (Definitive list of handler priorities) * `net.openhft.chronicle.threads.PauserMode` Javadoc (Definitive list of pauser modes) * Chronicle Affinity library documentation (For details on CPU binding syntax and capabilities) -* `thread-architecture-overview.adoc` (Runtime topology) -* `thread-operational-controls.adoc` (Operational safeguards) +* `architecture-overview.adoc` (Runtime topology) +* `operational-controls.adoc` (Operational safeguards) * `thread-performance-targets.adoc` (Benchmark methodology) * `thread-security-review.adoc` (Security posture) -* `thread-thread-safety-guide.adoc` (Confinement practices) +* `thread-safety-guide.adoc` (Confinement practices) [[open-issues]] == Open Issues / Future Enhancements @@ -201,7 +201,7 @@ EventGroup eg = EventGroup.builder() .withName("MatchingEngineGroup") // Sets the base name for the EventGroup and its child loops .withLoopCount(2) // Example: might influence number of certain types of loops if applicable .withPauserMode(PauserMode.BUSY) // Sets default pauser for core loops - .withPriorities(EnumSet.of(HandlerPriority.HIGH, HandlerPriority.MEDIUM, HandlerPriority.MONITOR)) // Specify which handler priorities this group will support + .withPriorities(EnumSet.of(HandlerPriority.HIGH, HandlerPriority.MEDIUM, HandlerPriority.MONITOR)) // Specify which handler priorities this group will support; if configured with an empty set only the monitor loop is present and handlers with any other priority are rejected .build(); // Add application-specific handlers diff --git a/src/main/docs/thread-architecture-overview.adoc b/src/main/docs/thread-architecture-overview.adoc deleted file mode 100644 index 3d6b91f5e..000000000 --- a/src/main/docs/thread-architecture-overview.adoc +++ /dev/null @@ -1,95 +0,0 @@ -= Chronicle Threads Architecture Overview -:toc: -:sectnums: -:lang: en-GB - -== Purpose - -This guide explains how Chronicle Threads composes event loops, 
handlers, pausers, and monitoring into a cohesive runtime so that engineers can reason about placement, affinity, and operational behaviour. -It complements the functional catalogue in `project-requirements.adoc` and provides concrete design cues for solution architects. - -== Event Loop Topologies - -Chronicle Threads organises work into named `EventLoop` instances that the `EventGroup` manages (THR-FN-001). -Each loop is single-threaded for handler execution (THR-FN-006) and is categorised by handler priority (THR-FN-005). - -.... - EventGroup - | - +-- CoreLoop[HIGH|MEDIUM] ---> fast path handlers (trading logic, matching) - | - +-- BlockingPool[BLOCKING] --> dedicated threads for I/O or storage waits - | - +-- TimerLoop[TIMER] ------> scheduled maintenance and time-based work - | - +-- MonitorLoop[MONITOR] -> observes loop-block latency and pauser metrics -.... - -Handlers attach to the loop whose priority matches their declared `HandlerPriority`. -An `EventGroup` materialises blocking and monitor loops only when required. -Applications can deploy multiple `EventGroup` instances in the same JVM to isolate subsystems whilst sharing pauser implementations. - -== Handler Lifecycle and Serial Execution - -Handlers are added at runtime via `EventGroup.addHandler()` (THR-FN-004). -The loop invokes each handler serially, ensuring stateful logic can remain lock-free (THR-FN-006). -The handler signals its progress via the boolean return value of `action()` (THR-FN-007). -Self-removal uses `InvalidEventHandlerException` (THR-FN-008); the loop removes the handler, logs through the standard `Jvm` channel, and continues running (THR-NF-O-009). - -Handlers should bound their execution time so that monitor loops can flag outliers reliably (THR-NF-O-018). -Long-running work belongs on the `BLOCKING` priority where independent threads handle it. 
-When reconfiguring a live loop, call `EventLoop.addHandler()` on the owning loop thread or rely on the concurrency-safe wrappers provided by `EventGroup`. - -== Pauser Strategy and Scheduler Interaction - -Pausers implement the idle strategy for each loop and are configured via builders or per-loop overrides (THR-FN-010, THR-FN-011). -Adaptive pausers expose tuning parameters that balance busy-spin and sleeping phases (THR-FN-012) while exposing metrics for observability (THR-NF-O-013, THR-NF-O-021). - -* `BUSY` / `TIMED_BUSY`: Bind to isolated cores, targeting nanosecond wake-up latency (THR-DOC-016). -* `BALANCED` / `SLEEPY`: Combine spin, yield, and park for mixed workloads. -* Custom: Provide a bespoke `Pauser` for domain-specific throttling. - -Hot paths avoid allocations (THR-NF-P-014) so a pauser change cannot introduce garbage. -Each loop records the time spent paused, supporting utilisation diagnostics. - -== Affinity and NUMA Alignment - -Affinity strings supplied via builders control how loops bind to hardware threads (THR-FN-015). -They accept the Chronicle Affinity syntax, including NUMA-aware layouts (THR-FN-017). -Example: - ----- -EventGroup eg = EventGroup.builder() - .withName("risk-eg") - .withBinding("0,2-3") - .build(); ----- - -* `0` binds the primary high-priority loop to core 0. -* `2-3` pins additional loops (e.g., MONITOR or BLOCKING) across cores 2 and 3. - -When multiple `EventGroup` instances coexist, coordinate bindings to avoid core contention. -Document selected affinities alongside deployment manifests so operators can validate CPU isolation. - -== Monitoring Plane - -Each `EventGroup` provisions a monitor loop that samples execution times and resets pausers at configurable intervals (THR-NF-O-018, THR-NF-O-019, THR-OPS-020). -The monitor loop: - -* Measures handler invocation duration, logging stack traces for breaches. -* Publishes pauser metrics through configured `PauserMonitorFactory` hooks. 
-* Responds to system properties that disable or tune monitoring (THR-OPS-023). - -The monitoring loop is not latency-critical but must keep pace with the core loops to avoid stale diagnostics. -Ensure JVM logging levels capture WARN messages from monitor handlers in production. - -== Integration Touchpoints - -Chronicle Threads commonly underpins Chronicle Queue tailers, Chronicle Map maintenance tasks, and application-specific pipelines. -When integrating: - -* Use `net.openhft.chronicle.core.io.Closeable` semantics to align handler lifecycle with queue appenders or tailers. -* Combine telemetry exports with the monitor loop to funnel utilisation metrics to the estate-wide monitoring system. -* Align handler priorities with data criticality so that core loops handle order flow while auxiliary loops manage persistence, replay, or housekeeping. - -Refer to `README.adoc` for code-level examples and to the operational controls document for deployment-time safeguards. diff --git a/src/main/docs/thread-performance-targets.adoc b/src/main/docs/thread-performance-targets.adoc index 4d823f05d..44f776f48 100644 --- a/src/main/docs/thread-performance-targets.adoc +++ b/src/main/docs/thread-performance-targets.adoc @@ -2,6 +2,7 @@ :toc: :sectnums: :lang: en-GB +:source-highlighter: rouge == Scope @@ -25,8 +26,8 @@ Variations :: [cols="2,3,3",options="header"] |=== |Requirement |Target |Measurement Notes -|THR-NF-P-027 (Latency) |<= 10 microseconds at 99.99 percentile for single-hop handler runs |Profiling harness schedules 10 million iterations with a busy pauser and isolated core. -|THR-NF-P-028 (Jitter) |<= 2 microseconds peak-to-peak jitter under steady load |Continuous histogram per handler, sampled via monitor loop over 15 minute windows. +|THR-NF-P-027 (Latency) |<= 10 µs at 99.99 percentile for single-hop handler runs |Profiling harness schedules 10 million iterations with a busy pauser and isolated core. 
+|THR-NF-P-028 (Jitter) |<= 2 µs peak-to-peak jitter under steady load |Continuous histogram per handler, sampled via monitor loop over 15 minute windows. |THR-NF-P-029 (Throughput) |>= 5 million 64-byte events per second on a fast loop |Benchmark harness dispatches fixed-size payloads, recording sustained processing rate. |THR-NF-P-030 (Heap Allocation) |<= 0.1 Bytes per event averaged across handlers |Java Flight Recorder or allocation profiler attached during workload replay. |THR-NF-P-014 (Pauser Hot Path) |0 allocations in `Pauser.pause()` / `reset()` |Unit tests instrumented with allocation counters; CI gate fails on non-zero heap activity. diff --git a/src/main/docs/thread-thread-safety-guide.adoc b/src/main/docs/thread-safety-guide.adoc similarity index 88% rename from src/main/docs/thread-thread-safety-guide.adoc rename to src/main/docs/thread-safety-guide.adoc index 4aae0f226..5f83ce716 100644 --- a/src/main/docs/thread-thread-safety-guide.adoc +++ b/src/main/docs/thread-safety-guide.adoc @@ -2,12 +2,19 @@ :toc: :sectnums: :lang: en-GB +:source-highlighter: rouge == Scope This guide explains how Chronicle Threads enforces single-threaded handler execution and how developers should structure code that interacts with event loops. It expands on requirements THR-FN-006 through THR-NF-O-009 and aligns with Chronicle Core's `SingleThreadedChecked` utilities. +Read this guide together with: + +* link:architecture-overview.adoc[architecture-overview.adoc] – for loop topology and handler placement. +* link:thread-performance-targets.adoc[thread-performance-targets.adoc] – for performance expectations that influence safe handler design. +* link:operational-controls.adoc[operational-controls.adoc] – for operational safeguards that rely on thread-safety guarantees. 
+ == Event Loop Ownership Model * Each `EventLoop` runs on a dedicated Java platform thread; handlers registered on that loop must not share mutable state with other threads without explicit synchronisation (THR-FN-006). diff --git a/src/main/docs/thread-security-review.adoc b/src/main/docs/thread-security-review.adoc index d24a53a40..28629ad1f 100644 --- a/src/main/docs/thread-security-review.adoc +++ b/src/main/docs/thread-security-review.adoc @@ -2,6 +2,35 @@ :toc: :sectnums: :lang: en-GB +:source-highlighter: rouge + +This document summarises security considerations for Chronicle Threads. +It is aligned with the shared standards in `Chronicle-Quality-Rules/src/main/docs/security-review.adoc` and `Chronicle-Quality-Rules/src/main/docs/architectural-standards.adoc`. + +== Trust Zone and Responsibilities + +Chronicle Threads is a _Foundation (Zone C)_ module in the Chronicle stack, as defined in `Chronicle-Quality-Rules/src/main/docs/architectural-standards.adoc`. +It provides scheduling, pauser and monitoring infrastructure that higher-level modules such as Chronicle Queue, Chronicle Network and Chronicle Services depend on. +Within this trust zone Chronicle Threads focuses on: + +* predictable execution and resource usage for event loops and pausers; +* clear boundaries around what it does *not* do (for example networking, cryptography and business-level access control); +* avoiding behaviours that would undermine the security guarantees of modules built on top of it. + +The remainder of this document expands on these responsibilities and their interaction with ISO 27001 themes. + +== Security Model Overview + +Chronicle Threads is a foundational library that orchestrates threads, event loops, pausers and monitoring. +It does not accept network connections, parse untrusted payloads directly, or perform cryptographic operations; instead it hosts application-provided `EventHandler` code and integrates with other Chronicle modules. 
+ +Threat boundaries therefore sit at: + +* Handler admission and configuration (which handlers are installed, with what privileges). +* System properties and configuration files that influence pausers, monitoring and affinity. +* Dependencies such as Chronicle Core and the Affinity library that perform low-level operations on behalf of Threads. + +The following sections describe risks and mitigations at these boundaries. == Handler Admission and Privilege Escalation @@ -96,3 +125,21 @@ Mitigations :: Review hot-spots :: * Custom forks of Chronicle libraries. * Environments that block outbound network access, delaying vulnerability scanning updates. + +== Summary Against ISO 27001 Themes + +Secure coding and bounds checking (A.8.28) :: +* Chronicle Threads itself does not parse external payloads; input validation and bounds checking for business data reside in callers such as Chronicle Queue or application handlers. +* Within the module, system properties that influence behaviour are validated for type and constrained to documented ranges where applicable (for example thresholds and timeouts). + +Access control and privileged operations (A.8.3) :: +* Chronicle Threads does not implement authentication or authorisation; it runs with the privileges of the hosting JVM. +* Operational controls should ensure only trusted code can register handlers or modify JVM arguments, and that handler classes are reviewed under least-privilege principles. + +Cryptographic and network controls (A.8.22, A.8.24) :: +* The module does not implement cryptography or manage keys directly, nor does it open network sockets. +* When Chronicle Threads hosts handlers that use TLS or network stacks from other libraries, those components’ security guides and configuration must be followed. + +Vulnerability management (A.8.8) :: +* Dependency versions are controlled via Chronicle BOMs; vulnerability scanning and CVE tracking are handled at the organisation level across all modules. 
+* Teams using Chronicle Threads should ensure their BOM is kept current and that central security advisories are applied. diff --git a/src/main/java/net/openhft/chronicle/threads/AbstractLifecycleEventLoop.java b/src/main/java/net/openhft/chronicle/threads/AbstractLifecycleEventLoop.java index e4eaed7c5..51da8f2fc 100644 --- a/src/main/java/net/openhft/chronicle/threads/AbstractLifecycleEventLoop.java +++ b/src/main/java/net/openhft/chronicle/threads/AbstractLifecycleEventLoop.java @@ -41,7 +41,7 @@ public abstract class AbstractLifecycleEventLoop extends AbstractCloseable imple private static final long AWAIT_TERMINATION_TIMEOUT_MS = TimeUnit.MINUTES.toMillis(5); private final AtomicReference lifecycle = new AtomicReference<>(EventLoopLifecycle.NEW); protected final String name; - boolean privateGroup; + volatile boolean privateGroup; /** * Create an instance with the supplied name. diff --git a/src/main/java/net/openhft/chronicle/threads/BlockingEventLoop.java b/src/main/java/net/openhft/chronicle/threads/BlockingEventLoop.java index 021b13dd7..36a291d71 100644 --- a/src/main/java/net/openhft/chronicle/threads/BlockingEventLoop.java +++ b/src/main/java/net/openhft/chronicle/threads/BlockingEventLoop.java @@ -94,7 +94,7 @@ private void startHandler(final EventHandler handler) { try { final Runner runner = new Runner(handler, pauserSupplier.get()); runners.add(runner); - service.submit(runner); + service.execute(runner); } catch (RejectedExecutionException e) { if (!service.isShutdown()) @@ -147,6 +147,7 @@ public String toString() { '}'; } + @SuppressWarnings("ForLoopReplaceableByForEach") @Override public boolean isRunningOnThread(Thread thread) { for (int i=0; i < runners.size(); i++) { diff --git a/src/main/java/net/openhft/chronicle/threads/BusyPauser.java b/src/main/java/net/openhft/chronicle/threads/BusyPauser.java index 000d2fbf1..f021007fa 100644 --- a/src/main/java/net/openhft/chronicle/threads/BusyPauser.java +++ 
b/src/main/java/net/openhft/chronicle/threads/BusyPauser.java @@ -41,10 +41,9 @@ public void pause() { * * @param timeout timeout duration (ignored) * @param timeUnit unit of the timeout (ignored) - * @throws TimeoutException never thrown */ @Override - public void pause(long timeout, TimeUnit timeUnit) throws TimeoutException { + public void pause(long timeout, TimeUnit timeUnit) { throw new UnsupportedOperationException(this + " is not stateful, use a " + BusyTimedPauser.class.getSimpleName()); } diff --git a/src/main/java/net/openhft/chronicle/threads/DiskSpaceMonitor.java b/src/main/java/net/openhft/chronicle/threads/DiskSpaceMonitor.java index 8d9d698fe..6de18bea9 100644 --- a/src/main/java/net/openhft/chronicle/threads/DiskSpaceMonitor.java +++ b/src/main/java/net/openhft/chronicle/threads/DiskSpaceMonitor.java @@ -19,6 +19,7 @@ import java.util.concurrent.ConcurrentHashMap; import java.util.concurrent.ScheduledExecutorService; import java.util.concurrent.TimeUnit; +import java.util.concurrent.atomic.AtomicInteger; /** * Monitors free space on the disks used by this JVM. @@ -41,7 +42,7 @@ *

The {@link #run()} loop iterates over the tracked {@link DiskAttributes} * entries. Each record stores a {@link FileStore}, the time for the next check * and the total size. When the free space is less than two hundred megabytes a - * panic notification is sent. Otherwise the next check is delayed based on the + * panic notification is sent. Otherwise, the next check is delayed based on the * amount of free space.

*/ public enum DiskSpaceMonitor implements Runnable, Closeable { @@ -55,7 +56,8 @@ public enum DiskSpaceMonitor implements Runnable, Closeable { final Map fileStoreCacheMap = new ConcurrentHashMap<>(); final Map diskAttributesMap = new ConcurrentHashMap<>(); final ScheduledExecutorService executor; - private int thresholdPercentage = Jvm.getInteger("chronicle.disk.monitor.threshold.percent", 5); + private final AtomicInteger thresholdPercentage = new AtomicInteger( + Jvm.getInteger("chronicle.disk.monitor.threshold.percent", 5)); private TimeProvider timeProvider = SystemTimeProvider.INSTANCE; DiskSpaceMonitor() { @@ -105,7 +107,7 @@ public void pollDiskSpace(File file) { return; } } - DiskAttributes da = diskAttributesMap.computeIfAbsent(fs, DiskAttributes::new); + diskAttributesMap.computeIfAbsent(fs, DiskAttributes::new); final long tookUs = (timeProvider.currentTimeNanos() - start) / 1_000; if (tookUs > TIME_TAKEN_WARN_THRESHOLD_US) @@ -127,13 +129,14 @@ public void run() { } public int getThresholdPercentage() { - return thresholdPercentage; + return thresholdPercentage.get(); } public void setThresholdPercentage(int thresholdPercentage) { - this.thresholdPercentage = thresholdPercentage; + this.thresholdPercentage.set(thresholdPercentage); } + @SuppressWarnings("ProtectedMemberInFinalClass") @VisibleForTesting protected void setTimeProvider(TimeProvider timeProvider) { this.timeProvider = timeProvider; @@ -170,8 +173,9 @@ void run() throws IOException { // if less than 200 Megabytes notifyDiskLow.panic(fileStore); - } else if (unallocatedBytes < totalSpace * DiskSpaceMonitor.INSTANCE.thresholdPercentage / 100) { - final double diskSpaceFull = ((long) (1000d * (totalSpace - unallocatedBytes) / totalSpace + 0.999)) / 10.0; + } else if (unallocatedBytes < totalSpace * DiskSpaceMonitor.INSTANCE.thresholdPercentage.get() / 100) { + final double usedFraction = (double) (totalSpace - unallocatedBytes) / totalSpace; + final double diskSpaceFull = 
Math.ceil(usedFraction * 1000d) / 10d; notifyDiskLow.warning(diskSpaceFull, fileStore); } else { @@ -179,8 +183,11 @@ void run() throws IOException { timeNextCheckedMS = now + (unallocatedBytes >> 20); } long time = System.nanoTime() - start; - if (time > 1_000_000) - Jvm.perf().on(getClass(), "Took " + time / 10_000 / 100.0 + " ms to check the disk space of " + fileStore); + if (time > 1_000_000) { + long hundredths = time / 10_000; + double millis = hundredths / 100.0; + Jvm.perf().on(getClass(), "Took " + millis + " ms to check the disk space of " + fileStore); + } } } diff --git a/src/main/java/net/openhft/chronicle/threads/EventGroup.java b/src/main/java/net/openhft/chronicle/threads/EventGroup.java index 444cf0b48..924c534f7 100644 --- a/src/main/java/net/openhft/chronicle/threads/EventGroup.java +++ b/src/main/java/net/openhft/chronicle/threads/EventGroup.java @@ -54,9 +54,7 @@ * eg.start(); * */ -public class EventGroup - extends AbstractLifecycleEventLoop - implements EventLoop { +public class EventGroup extends AbstractLifecycleEventLoop implements EventLoop { public static final int CONC_THREADS = Jvm.getInteger("eventGroup.conc.threads", Jvm.getInteger("CONC_THREADS", Math.max(1, Runtime.getRuntime().availableProcessors() / 4))); @@ -331,7 +329,19 @@ private void waitToStart(EventLoop waitfor) { EventLoopStateRenderer.INSTANCE.render("Core", core), EventLoopStateRenderer.INSTANCE.render("Monitor", monitor), threadDump)); - throw Jvm.rethrow(e); + String coreState = core == null + ? 
"Core loop not configured" + : EventLoopStateRenderer.INSTANCE.render("Core", core); + String monitorState = EventLoopStateRenderer.INSTANCE.render("Monitor", monitor); + String message = format("Timed out waiting %,dms for %s to start%n%s%n%n%s%n%n%s", + waitTime, + waitfor.name(), + coreState, + monitorState, + renderThreadDump()); + TimeoutException te = new TimeoutException(message); + te.initCause(e); + throw Jvm.rethrow(te); } } } @@ -388,7 +398,7 @@ protected void performClose() { @Override public boolean runsInsideCoreLoop() { - return core.runsInsideCoreLoop(); + return core != null && core.runsInsideCoreLoop(); } @Override diff --git a/src/main/java/net/openhft/chronicle/threads/EventGroupBuilder.java b/src/main/java/net/openhft/chronicle/threads/EventGroupBuilder.java index 5a57ae67b..2d5aafd4e 100644 --- a/src/main/java/net/openhft/chronicle/threads/EventGroupBuilder.java +++ b/src/main/java/net/openhft/chronicle/threads/EventGroupBuilder.java @@ -73,7 +73,7 @@ private EventGroupBuilder() { */ @SuppressWarnings("deprecation") @Override - public EventGroup build() { + public @NotNull EventGroup build() { EventGroup eventGroup = new EventGroup(daemon, pauserOrDefault(), replicationPauser, @@ -83,7 +83,7 @@ public EventGroup build() { concurrentThreadsNum, defaultBinding(concurrentBinding), concurrentPauserSupplier, - priorities, + EnumSet.copyOf(priorities), blockingPauserSupplier); eventGroup.privateGroup(privateGroup); return eventGroup; @@ -245,12 +245,14 @@ public EventGroupBuilder withConcurrentPauserSupplier(@NotNull Supplier /** * Chooses which handler priorities the group will support. Loops for * priorities not included are not created. The default is all priorities. + * The behaviour when the set is empty is to create a group that retains + * only its monitor loop; attempts to add handlers with any other priority will fail. 
* * @param priorities set of priorities to enable * @return this builder */ public EventGroupBuilder withPriorities(Set priorities) { - this.priorities = priorities; + this.priorities = EnumSet.copyOf(priorities); return this; } @@ -258,7 +260,7 @@ public EventGroupBuilder withPriorities(Set priorities) { * Convenience overload to build a priority set from the given arguments. * * @param firstPriority first priority in the set - * @param priorities remaining priorities + * @param priorities remaining priorities * @return this builder */ public EventGroupBuilder withPriorities(HandlerPriority firstPriority, HandlerPriority... priorities) { diff --git a/src/main/java/net/openhft/chronicle/threads/LongPauser.java b/src/main/java/net/openhft/chronicle/threads/LongPauser.java index bd4f6a713..b1198f4e2 100644 --- a/src/main/java/net/openhft/chronicle/threads/LongPauser.java +++ b/src/main/java/net/openhft/chronicle/threads/LongPauser.java @@ -80,6 +80,7 @@ public void pause() { try { pause(Long.MAX_VALUE, TimeUnit.SECONDS); } catch (TimeoutException ignored) { + // ignore - effectively infinite timeout should not expire } } @@ -181,14 +182,21 @@ private void yield() { * @param delayNs pause duration in nanoseconds */ void doPause(long delayNs) { - long start = System.nanoTime(); - thread = Thread.currentThread(); + final Thread threadSnapshot = Thread.currentThread(); + thread = threadSnapshot; pausing.set(true); - if (!thread.isInterrupted()) + long elapsed = 0; + try { + if (!threadSnapshot.isInterrupted()) { + final long start = System.nanoTime(); LockSupport.parkNanos(delayNs); + elapsed = System.nanoTime() - start; + } + } finally { pausing.set(false); - long time = System.nanoTime() - start; - timePaused += time; + thread = null; + } + timePaused += elapsed; } @Override diff --git a/src/main/java/net/openhft/chronicle/threads/MediumEventLoop.java b/src/main/java/net/openhft/chronicle/threads/MediumEventLoop.java index 03c57a958..a33f7d669 100644 --- 
a/src/main/java/net/openhft/chronicle/threads/MediumEventLoop.java +++ b/src/main/java/net/openhft/chronicle/threads/MediumEventLoop.java @@ -5,7 +5,6 @@ import net.openhft.affinity.AffinityLock; import net.openhft.chronicle.core.Jvm; -import net.openhft.chronicle.core.annotation.HotMethod; import net.openhft.chronicle.core.io.AbstractCloseable; import net.openhft.chronicle.core.io.Closeable; import net.openhft.chronicle.core.io.ClosedIllegalStateException; @@ -47,9 +46,9 @@ public class MediumEventLoop extends AbstractLifecycleEventLoop implements CoreE private final transient Object startStopMutex = new Object(); @Nullable - protected transient final EventLoop parent; + protected final transient EventLoop parent; @NotNull - protected transient final ExecutorService service; + protected final transient ExecutorService service; protected final List mediumHandlers = new CopyOnWriteArrayList<>(); protected final ConcurrentLinkedQueue newHandlers = new ConcurrentLinkedQueue<>(); protected final Pauser pauser; @@ -245,7 +244,6 @@ public long loopStartNS() { } @Override - @HotMethod @SuppressWarnings("try") public void run() { try { @@ -253,8 +251,6 @@ public void run() { // Make sure nobody's adding a handler while we do this synchronized (addHandlerMutex) { thread = Thread.currentThread(); - if (thread == null) - throw new NullPointerException(); loopStartedAllHandlers(); } runLoop(); @@ -267,6 +263,7 @@ public void run() { } finally { loopFinishedAllHandlers(); loopStartNS = NOT_IN_A_LOOP; + thread = null; } } catch (Throwable e) { Jvm.warn().on(getClass(), hasBeen("terminated due to exception"), e); @@ -369,7 +366,7 @@ private void closeAll() { } // Unrolled to avoid megamorphic call chains. 
- @SuppressWarnings("fallthrough") + @SuppressWarnings({"fallthrough", "DefaultNotLastCaseInSwitch", "java:S4524", "java:S1141"}) private boolean runAllMediumHandler() { boolean busy = false; final EventHandler[] handlers = this.mediumHandlersArray; @@ -425,7 +422,7 @@ private boolean runAllMediumHandler() { } // Unrolled to reduce megamorphic calls and keep the JIT hot. - @SuppressWarnings("fallthrough") + @SuppressWarnings({"fallthrough", "DefaultNotLastCaseInSwitch"}) protected boolean runAllHandlers() { boolean busy = false; final EventHandler[] handlers = this.mediumHandlersArray; @@ -434,6 +431,7 @@ protected boolean runAllHandlers() { busy |= callHighHandler(); switch (handlers.length) { + //noinspection DefaultNotLastCaseInSwitch default: for (int i = handlers.length - 1; i >= 4; i--) { busy |= callHighHandler(); @@ -534,7 +532,6 @@ protected void updateMediumHandlersArray() { this.mediumHandlersArray = mediumHandlers.toArray(NO_EVENT_HANDLERS); } - @HotMethod private boolean acceptNewHandlers() { boolean result = false; EventHandler handler; @@ -622,7 +619,9 @@ public void dumpRunningState(@NotNull final String message, @NotNull final Boole // Better to log that a blockage was found (and that the user has paid for a slow getStackTrace()) final long timeToTakeStackTraceMillis = (System.nanoTime() - startTimeNanos) / 1_000_000; out.setLength(messageIndex); - out.append(" An accurate stack trace could not be determined (capturing the stack trace took " + timeToTakeStackTraceMillis + "ms)"); + out.append(" An accurate stack trace could not be determined (capturing the stack trace took ") + .append(timeToTakeStackTraceMillis) + .append("ms)"); } Jvm.perf().on(getClass(), out.toString()); } @@ -680,20 +679,21 @@ protected void performClose() { } private void shutdownService() { - LockSupport.unpark(thread); + Thread threadSnapshot = thread; + LockSupport.unpark(threadSnapshot); if (privateGroup) { service.shutdownNow(); return; } Threads.shutdown(service, 
daemon); - if (thread != null && thread != Thread.currentThread()) { + if (threadSnapshot != null && threadSnapshot != Thread.currentThread()) { long startTimeMillis = System.currentTimeMillis(); long waitUntilMs = startTimeMillis; - thread.interrupt(); + threadSnapshot.interrupt(); for (int i = 1; i <= 50; i++) { - if (!thread.isAlive()) + if (!threadSnapshot.isAlive()) break; // we do this loop below to protect from Jvm.pause not pausing for as long as it should waitUntilMs += i; @@ -704,9 +704,9 @@ private void shutdownService() { final StringBuilder sb = new StringBuilder(); long ms = System.currentTimeMillis() - startTimeMillis; sb.append(name).append(": Shutting down thread is executing after "). - append(ms).append("ms ").append(thread) + append(ms).append("ms ").append(threadSnapshot) .append(", " + "handlerCount=").append(nonDaemonHandlerCount()); - Jvm.trimStackTrace(sb, thread.getStackTrace()); + Jvm.trimStackTrace(sb, threadSnapshot.getStackTrace()); Jvm.warn().on(getClass(), sb.toString()); dumpRunningHandlers(); } diff --git a/src/main/java/net/openhft/chronicle/threads/MilliPauser.java b/src/main/java/net/openhft/chronicle/threads/MilliPauser.java index d10b19db3..3540f5fa5 100644 --- a/src/main/java/net/openhft/chronicle/threads/MilliPauser.java +++ b/src/main/java/net/openhft/chronicle/threads/MilliPauser.java @@ -107,10 +107,9 @@ public boolean asyncPausing() { * * @param timeout the maximum time to pause in the specified {@code timeUnit} * @param timeUnit the unit of time for {@code timeout} - * @throws TimeoutException if the pause operation is not completed within the specified timeout */ @Override - public void pause(long timeout, @NotNull TimeUnit timeUnit) throws TimeoutException { + public void pause(long timeout, @NotNull TimeUnit timeUnit) { doPauseMS(timeUnit.toMillis(timeout)); } @@ -121,14 +120,21 @@ public void pause(long timeout, @NotNull TimeUnit timeUnit) throws TimeoutExcept * @param delayMS delay in milliseconds */ void 
doPauseMS(long delayMS) { - long start = System.nanoTime(); - thread = Thread.currentThread(); + final Thread threadSnapshot = Thread.currentThread(); + thread = threadSnapshot; pausing.set(true); - if (!thread.isInterrupted()) + long elapsed = 0; + try { + if (!threadSnapshot.isInterrupted()) { + final long start = System.nanoTime(); LockSupport.parkNanos(delayMS * 1_000_000L); + elapsed = System.nanoTime() - start; + } + } finally { pausing.set(false); - long time = System.nanoTime() - start; - timePaused += time; + thread = null; + } + timePaused += elapsed; countPaused++; } diff --git a/src/main/java/net/openhft/chronicle/threads/MonitorEventLoop.java b/src/main/java/net/openhft/chronicle/threads/MonitorEventLoop.java index 71f52d964..264e553b0 100644 --- a/src/main/java/net/openhft/chronicle/threads/MonitorEventLoop.java +++ b/src/main/java/net/openhft/chronicle/threads/MonitorEventLoop.java @@ -4,7 +4,6 @@ package net.openhft.chronicle.threads; import net.openhft.chronicle.core.Jvm; -import net.openhft.chronicle.core.annotation.HotMethod; import net.openhft.chronicle.core.io.Closeable; import net.openhft.chronicle.core.io.SimpleCloseable; import net.openhft.chronicle.core.threads.EventHandler; @@ -18,7 +17,10 @@ import java.util.concurrent.ExecutorService; import java.util.concurrent.Executors; -import static net.openhft.chronicle.threads.Threads.*; +import static net.openhft.chronicle.threads.Threads.eventLoopQuietly; +import static net.openhft.chronicle.threads.Threads.loopFinishedQuietly; +import static net.openhft.chronicle.threads.Threads.loopStartedCall; +import static net.openhft.chronicle.threads.Threads.shutdownDaemon; /** * Event loop dedicated to low-frequency monitoring tasks. 
Handlers added to this loop are @@ -33,8 +35,8 @@ public class MonitorEventLoop extends AbstractLifecycleEventLoop implements Runn public static final String MONITOR_INITIAL_DELAY = "MonitorInitialDelay"; static int MONITOR_INITIAL_DELAY_MS = Jvm.getInteger(MONITOR_INITIAL_DELAY, 10_000); - private transient final ExecutorService service; - private transient final EventLoop parent; + private final transient ExecutorService service; + private final transient EventLoop parent; private final List handlers = new CopyOnWriteArrayList<>(); private final Pauser pauser; private transient volatile Thread thread = null; @@ -53,7 +55,7 @@ public MonitorEventLoop(final EventLoop parent, final String name, final Pauser @Override protected void performStart() { - service.submit(this); + service.execute(this); } @Override @@ -73,7 +75,7 @@ protected void performStopFromStarted() { private void performStop() { unpause(); - Threads.shutdownDaemon(service); + shutdownDaemon(service); } @Override @@ -92,7 +94,7 @@ public boolean isAlive() { public synchronized void addHandler(@NotNull final EventHandler handler) { throwExceptionIfClosed(); - if (DEBUG_ADDING_HANDLERS) + if (EventLoop.DEBUG_ADDING_HANDLERS) Jvm.debug().on(getClass(), "Adding " + handler.priority() + " " + handler + " to " + this.name); if (isClosed()) throw new IllegalStateException("Event Group has been closed"); @@ -102,7 +104,6 @@ public synchronized void addHandler(@NotNull final EventHandler handler) { } @Override - @HotMethod public void run() { throwExceptionIfClosed(); @@ -126,10 +127,10 @@ public void run() { synchronized (this) { handlers.forEach(Threads::loopFinishedQuietly); } + thread = null; } } - @HotMethod private boolean runHandlers() { boolean busy = false; for (int i = 0; i < handlers.size(); i++) { @@ -155,7 +156,7 @@ private synchronized void removeHandler(int handlerIndex) { EventHandler removedHandler = handlers.remove(handlerIndex); loopFinishedQuietly(removedHandler); 
Closeable.closeQuietly(removedHandler); - if (DEBUG_REMOVING_HANDLERS) + if (EventLoop.DEBUG_REMOVING_HANDLERS) Jvm.debug().on(getClass(), "Removing " + removedHandler.priority() + " " + removedHandler + " from " + this.name); } catch (ArrayIndexOutOfBoundsException e) { if (!handlers.isEmpty()) { @@ -184,7 +185,7 @@ public boolean isRunningOnThread(Thread thread) { */ private static final class IdempotentLoopStartedEventHandler extends SimpleCloseable implements EventHandler { - private transient final EventHandler eventHandler; + private final transient EventHandler eventHandler; private final String handler; private boolean loopStarted = false; diff --git a/src/main/java/net/openhft/chronicle/threads/Pauser.java b/src/main/java/net/openhft/chronicle/threads/Pauser.java index ea8f606a5..e0e710c7b 100644 --- a/src/main/java/net/openhft/chronicle/threads/Pauser.java +++ b/src/main/java/net/openhft/chronicle/threads/Pauser.java @@ -266,6 +266,7 @@ enum SleepyWarning { } } + @SuppressWarnings("EmptyMethod") static void warnSleepy() { // Do nothing here as run-once code is in the static block above. } diff --git a/src/main/java/net/openhft/chronicle/threads/ThreadHolder.java b/src/main/java/net/openhft/chronicle/threads/ThreadHolder.java index c0e0b4ae1..aeae237de 100644 --- a/src/main/java/net/openhft/chronicle/threads/ThreadHolder.java +++ b/src/main/java/net/openhft/chronicle/threads/ThreadHolder.java @@ -18,9 +18,8 @@ public interface ThreadHolder { * Indicates whether the monitored thread is still running. * * @return {@code true} if the thread has not terminated - * @throws InvalidEventHandlerException if the holder can no longer be queried */ - boolean isAlive() throws InvalidEventHandlerException; + boolean isAlive(); /** * Called once the thread has ended so monitoring can be stopped or logged. 
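Note on the `submit` to `execute` change: this diff replaces `service.submit(this)` with `service.execute(this)` in `MonitorEventLoop.performStart()`, and makes the same swap in `EventLoopConcurrencyStressTest`. The diff does not state the motivation. One plausible reason, offered here as an assumption, is that `submit` wraps the task in a `Future`, so an exception thrown by the task is stored in that `Future` and goes unreported if the `Future` is discarded, whereas `execute` lets the exception reach the worker thread's uncaught-exception handler. The sketch below shows that difference using only the JDK; the class and method names are hypothetical and not part of chronicle-threads.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; not part of chronicle-threads.
class SubmitVersusExecuteSketch {

    /**
     * Runs a task that always throws and reports whether the worker thread's
     * uncaught-exception handler was invoked.
     */
    static boolean uncaughtHandlerSawFailure(boolean useExecute) {
        final CountDownLatch handlerCalled = new CountDownLatch(1);
        final ThreadFactory factory = r -> {
            Thread t = new Thread(r, "sketch-worker");
            t.setDaemon(true);
            t.setUncaughtExceptionHandler((thread, ex) -> handlerCalled.countDown());
            return t;
        };
        final ExecutorService service = Executors.newSingleThreadExecutor(factory);
        final Runnable failing = () -> {
            throw new IllegalStateException("simulated failure");
        };
        try {
            if (useExecute)
                service.execute(failing); // exception escapes the worker thread
            else
                service.submit(failing);  // exception is stored in the discarded Future
            service.shutdown();
            service.awaitTermination(5, TimeUnit.SECONDS);
            // Short grace period: the handler runs on the dying worker thread.
            return handlerCalled.await(500, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            service.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("execute: handler invoked = " + uncaughtHandlerSawFailure(true));
        System.out.println("submit:  handler invoked = " + uncaughtHandlerSawFailure(false));
    }
}
```

With `execute` the handler is invoked; with `submit` it is not, because the failure is only observable through `Future.get()`. Whether this was the reason for the change in this PR is not confirmed by the diff itself.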
diff --git a/src/main/java/net/openhft/chronicle/threads/ThreadMonitors.java b/src/main/java/net/openhft/chronicle/threads/ThreadMonitors.java index c620c6c68..0ac1cb787 100644 --- a/src/main/java/net/openhft/chronicle/threads/ThreadMonitors.java +++ b/src/main/java/net/openhft/chronicle/threads/ThreadMonitors.java @@ -13,6 +13,14 @@ import java.util.function.LongSupplier; import java.util.function.Supplier; +/** + * Factory for {@link ThreadMonitor} instances used to watch long-running threads. + * <p>
+ * The helpers build monitors around a supplied {@link Thread}, time source and threshold, + * and arrange for stack traces to be logged when the thread appears blocked or stalled + * for longer than the configured limit. Separate variants are provided for generic and + * service threads, with optional call sites for custom logging behaviour. + */ public enum ThreadMonitors { ; // none diff --git a/src/main/java/net/openhft/chronicle/threads/Threads.java b/src/main/java/net/openhft/chronicle/threads/Threads.java index 83b66909f..d6020d925 100644 --- a/src/main/java/net/openhft/chronicle/threads/Threads.java +++ b/src/main/java/net/openhft/chronicle/threads/Threads.java @@ -237,7 +237,7 @@ static void forEachThread(ExecutorService service, Consumer consumer) { for (Object o : objects) { Thread t = Jvm.getValue(o, "thread"); - if (t.getState() != State.TERMINATED) + if (t != null && t.getState() != State.TERMINATED) consumer.accept(t); } } catch (Exception e) { diff --git a/src/main/java/net/openhft/chronicle/threads/VanillaEventLoop.java b/src/main/java/net/openhft/chronicle/threads/VanillaEventLoop.java index 2ffc5db6e..a00db5561 100644 --- a/src/main/java/net/openhft/chronicle/threads/VanillaEventLoop.java +++ b/src/main/java/net/openhft/chronicle/threads/VanillaEventLoop.java @@ -65,11 +65,6 @@ public VanillaEventLoop(@Nullable final EventLoop parent, this.priorities = EnumSet.copyOf(priorities); } - public static void closeAll(@NotNull final List handlers) { - // do not remove the handler here, remove all at end instead - Closeable.closeQuietly(handlers); - } - private static void clearUsedByThread(@NotNull EventHandler handler) { if (handler instanceof AbstractCloseable) ((AbstractCloseable) handler).singleThreadedCheckReset(); @@ -113,10 +108,8 @@ protected void loopStartedAllHandlers() { @Override protected void loopFinishedAllHandlers() { super.loopFinishedAllHandlers(); - if (!timerHandlers.isEmpty()) - timerHandlers.forEach(Threads::loopFinishedQuietly); 
- if (!daemonHandlers.isEmpty()) - daemonHandlers.forEach(Threads::loopFinishedQuietly); + finishHandlers(timerHandlers); + finishHandlers(daemonHandlers); } @Override @@ -134,6 +127,11 @@ protected void runDaemonHandlers() { runAllHandlers(daemonHandlers); } + private static void finishHandlers(List handlers) { + if (!handlers.isEmpty()) + handlers.forEach(Threads::loopFinishedQuietly); + } + private void runAllHandlers(List handlers) { for (int i = 0; i < handlers.size(); i++) { EventHandler handler = null; diff --git a/src/main/java/net/openhft/chronicle/threads/internal/EventLoopThreadHolder.java b/src/main/java/net/openhft/chronicle/threads/internal/EventLoopThreadHolder.java index 13d597983..cefe108f5 100644 --- a/src/main/java/net/openhft/chronicle/threads/internal/EventLoopThreadHolder.java +++ b/src/main/java/net/openhft/chronicle/threads/internal/EventLoopThreadHolder.java @@ -13,7 +13,6 @@ * than the configured monitoring interval. Each subsequent dump is spaced * further apart to reduce log volume while the loop remains stuck. 
*/ - public class EventLoopThreadHolder implements ThreadHolder { private final CoreEventLoop eventLoop; private final long monitorIntervalNS; @@ -57,6 +56,7 @@ public boolean shouldLog(long nowNS) { @Override public void dumpThread(long startedNS, long nowNS) { long blockingTimeNS = nowNS - startedNS; + @SuppressWarnings("IntegerDivisionInFloatingPointContext") double blockingTimeMS = blockingTimeNS / 100_000 / 10.0; if (blockingTimeMS <= 0.0) return; diff --git a/src/main/java/net/openhft/chronicle/threads/internal/ThreadMonitorHarness.java b/src/main/java/net/openhft/chronicle/threads/internal/ThreadMonitorHarness.java index b35274684..7c85ede17 100644 --- a/src/main/java/net/openhft/chronicle/threads/internal/ThreadMonitorHarness.java +++ b/src/main/java/net/openhft/chronicle/threads/internal/ThreadMonitorHarness.java @@ -64,6 +64,8 @@ public boolean action() throws InvalidEventHandlerException { // Record lastActionCall time on every call to prevent false-positive "monitorThreadDelayed" reports long actionCallDelay = nowNS - this.lastActionCall; + if (actionCallDelay < 0) + actionCallDelay = 0; this.lastActionCall = nowNS; if (startedNS == 0 || startedNS == NOT_IN_A_LOOP) { diff --git a/src/main/java/net/openhft/chronicle/threads/internal/ThreadsThreadHolder.java b/src/main/java/net/openhft/chronicle/threads/internal/ThreadsThreadHolder.java index 0b5d105cb..6ae0cb0fc 100644 --- a/src/main/java/net/openhft/chronicle/threads/internal/ThreadsThreadHolder.java +++ b/src/main/java/net/openhft/chronicle/threads/internal/ThreadsThreadHolder.java @@ -57,7 +57,7 @@ public ThreadsThreadHolder(String description, long timeLimitNS, LongSupplier ti } @Override - public boolean isAlive() throws InvalidEventHandlerException { + public boolean isAlive() { return threadSupplier.get().isAlive(); } @@ -78,7 +78,8 @@ public long startedNS() { @Override public void monitorThreadDelayed(long actionCallDelayNS) { - logConsumer.accept("Monitor thread for " + getName() + " cpuId: " + 
Affinity.getCpu() + " was delayed by " + actionCallDelayNS / 100000 / 10.0 + " ms"); + logConsumer.accept("Monitor thread for " + getName() + " cpuId: " + Affinity.getCpu() + + " was delayed by " + nanosecondsToMillisWithTenthsPrecision(actionCallDelayNS) + " ms"); } @Override diff --git a/src/test/java/net/openhft/chronicle/threads/DiskSpaceMonitorTest.java b/src/test/java/net/openhft/chronicle/threads/DiskSpaceMonitorTest.java index 33ab49870..dca2660d4 100644 --- a/src/test/java/net/openhft/chronicle/threads/DiskSpaceMonitorTest.java +++ b/src/test/java/net/openhft/chronicle/threads/DiskSpaceMonitorTest.java @@ -44,9 +44,9 @@ void pollDiskSpace() { // todo investigate why this fails on arm assumeTrue(!Jvm.isArm()); System.setProperty("chronicle.disk.monitor.threshold.percent", "0"); - Map map = Jvm.recordExceptions(); assertEquals(0, DiskSpaceMonitor.INSTANCE.getThresholdPercentage()); DiskSpaceMonitor.INSTANCE.setThresholdPercentage(100); + final Map map = Jvm.recordExceptions(); for (int i = 0; i < 51; i++) { DiskSpaceMonitor.INSTANCE.pollDiskSpace(new File(".")); Jvm.pause(100); @@ -78,5 +78,4 @@ void ensureThatDiskSpaceMonitorRunsForMoreThanOneIteration() throws InterruptedE timeProvider.advanceMillis(Duration.ofHours(24).toMillis()); Thread.sleep(1000); } - } diff --git a/src/test/java/net/openhft/chronicle/threads/EventGroupHandlerTest.java b/src/test/java/net/openhft/chronicle/threads/EventGroupHandlerTest.java index fe3079588..79eb360ff 100644 --- a/src/test/java/net/openhft/chronicle/threads/EventGroupHandlerTest.java +++ b/src/test/java/net/openhft/chronicle/threads/EventGroupHandlerTest.java @@ -25,15 +25,15 @@ class EventGroupHandlerTest extends ThreadsTestCommon { void beforeAll() { ignoreException("Monitoring a task which has finished "); // Initial delay defaults to 10secs. Set to 10ms for testing. 
- MonitorEventLoop.MONITOR_INITIAL_DELAY_MS = 10; + setMonitorInitialDelayMs(10); } @AfterEach void afterEach() { - MonitorEventLoop.MONITOR_INITIAL_DELAY_MS = 10_000; + setMonitorInitialDelayMs(10_000); } - private final String EVENT_GROUP_NAME = "test"; + private static final String EVENT_GROUP_NAME = "test"; private EventGroup createEventGroup() { return EventGroup.builder().withName(EVENT_GROUP_NAME).withDaemon(true).build(); @@ -326,5 +326,4 @@ void testThrowingEventLoopAddedAfterStartBlocking() { void testThrowingEventLoopAddedAfterStartConcurrent() { addThrowingEventLoopAfterEventLoopStarted(new ThrowingHandler(HandlerPriority.CONCURRENT, true, false)); } - } diff --git a/src/test/java/net/openhft/chronicle/threads/EventGroupTest.java b/src/test/java/net/openhft/chronicle/threads/EventGroupTest.java index 5b0245464..23649cf6c 100644 --- a/src/test/java/net/openhft/chronicle/threads/EventGroupTest.java +++ b/src/test/java/net/openhft/chronicle/threads/EventGroupTest.java @@ -51,12 +51,12 @@ public class EventGroupTest extends ThreadsTestCommon { @BeforeEach void handlersInit() { ignoreException("Monitoring a task which has finished "); - MonitorEventLoop.MONITOR_INITIAL_DELAY_MS = 1; + setMonitorInitialDelayMs(1); } @Override public void preAfter() throws InterruptedException { - MonitorEventLoop.MONITOR_INITIAL_DELAY_MS = 10_000; + setMonitorInitialDelayMs(10_000); for (TestHandler handler : this.handlers) handler.assertClosed(); @@ -73,6 +73,7 @@ void testEventLoopName() { } } + @SuppressWarnings("InstantiatingAThreadWithDefaultRunMethod") @Timeout(5) @Test void testSimpleEventGroupTest() throws InterruptedException { @@ -326,7 +327,7 @@ void checkAllEventHandlerTypesStartInvalidEventHandlerException() throws Interru } private void checkException(ExceptionType exceptionType) throws InterruptedException { - try (final EventLoop eventGroup = EventGroup.builder().build();) { + try (final EventLoop eventGroup = EventGroup.builder().build()) { for 
(HandlerPriority hp : HandlerPriority.values()) eventGroup.addHandler(new TestHandler(hp, exceptionType)); eventGroup.start(); @@ -387,6 +388,7 @@ public boolean action() throws InvalidEventHandlerException { assertTrue(EventLoop.inEventLoop(), priority.name()); priorities.add(priority); } catch (Throwable t) { + //noinspection CallToPrintStackTrace t.printStackTrace(); } throw new InvalidEventHandlerException("done"); @@ -450,18 +452,18 @@ private void lifecycleEventsAreCalledAtAppropriateTimesByAppropriateThreads_ForP final TestHandler handler = new TestHandler(handlerPriority); eventGroup.addHandler(handler); } - handlers.forEach(handler -> assertEquals(handler.loopStartedNS.get(), 0, handler.priority + " was loopStarted before loop started, priorities=" + priorities)); + handlers.forEach(handler -> assertEquals(0, handler.loopStartedNS.get(), handler.priority + " was loopStarted before loop started, priorities=" + priorities)); eventGroup.start(); - handlers.forEach(handler -> assertEquals(handler.loopFinishedNS.get(), 0, handler.priority + " was loopFinished before loop finished, priorities=" + priorities)); + handlers.forEach(handler -> assertEquals(0, handler.loopFinishedNS.get(), handler.priority + " was loopFinished before loop finished, priorities=" + priorities)); Jvm.pause(1000); - handlers.forEach(handler -> assertNotEquals(handler.loopStartedNS.get(), 0, handler.priority + " was not loopStarted when loop started, priorities=" + priorities)); + handlers.forEach(handler -> assertNotEquals(0, handler.loopStartedNS.get(), handler.priority + " was not loopStarted when loop started, priorities=" + priorities)); eventGroup.close(); - handlers.forEach(handler -> assertNotEquals(handler.loopFinishedNS.get(), 0, handler.priority + " was not loopFinished when loop finished, priorities=" + priorities)); + handlers.forEach(handler -> assertNotEquals(0, handler.loopFinishedNS.get(), handler.priority + " was not loopFinished when loop finished, priorities=" + 
priorities)); } private static Stream> egCloseParams() { return Stream.of( - Arrays.asList(HandlerPriority.MEDIUM), + Collections.singletonList(HandlerPriority.MEDIUM), Arrays.asList(HandlerPriority.MEDIUM, HandlerPriority.HIGH), Arrays.asList(HandlerPriority.TIMER, HandlerPriority.HIGH), Arrays.asList(HandlerPriority.MEDIUM, HandlerPriority.BLOCKING, HandlerPriority.TIMER), diff --git a/src/test/java/net/openhft/chronicle/threads/EventLoopConcurrencyStressTest.java b/src/test/java/net/openhft/chronicle/threads/EventLoopConcurrencyStressTest.java index 29b0ed4f8..28b2d0348 100644 --- a/src/test/java/net/openhft/chronicle/threads/EventLoopConcurrencyStressTest.java +++ b/src/test/java/net/openhft/chronicle/threads/EventLoopConcurrencyStressTest.java @@ -107,15 +107,15 @@ String className() { private void canConcurrentlyAddHandlersAndStartEventLoop(EventLoopTestParameters parameters, HandlerPriority priority) { Jvm.startup().on(EventLoopConcurrencyStressTest.class, "Executing test for " + parameters.eventLoopClass.getSimpleName() + " at priority " + priority); - ExecutorService executorService = Executors.newCachedThreadPool(); + final ExecutorService executorService = Executors.newCachedThreadPool(); try (AbstractLifecycleEventLoop eventLoop = parameters.eventLoopSupplier.get()) { List handlerAdders = new ArrayList<>(); CyclicBarrier cyclicBarrier = new CyclicBarrier(NUM_EVENT_ADDERS + 1); final EventLoopStarter eventLoopStarter = new EventLoopStarter(eventLoop, cyclicBarrier); - executorService.submit(eventLoopStarter); + executorService.execute(eventLoopStarter); for (int i = 0; i < NUM_EVENT_ADDERS; i++) { final HandlerAdder handlerAdder = new HandlerAdder(eventLoop, cyclicBarrier, () -> new ControllableHandler(priority)); - executorService.submit(handlerAdder); + executorService.execute(handlerAdder); handlerAdders.add(handlerAdder); } // wait until the starter has started the event loop @@ -139,10 +139,10 @@ private void 
canConcurrentlyAddHandlersAndStopEventLoop(EventLoopTestParameters< eventLoop.start(); CyclicBarrier cyclicBarrier = new CyclicBarrier(NUM_EVENT_ADDERS + 1); final EventLoopStopper eventLoopStopper = new EventLoopStopper(eventLoop, cyclicBarrier); - executorService.submit(eventLoopStopper); + executorService.execute(eventLoopStopper); for (int i = 0; i < NUM_EVENT_ADDERS; i++) { final HandlerAdder handlerAdder = new HandlerAdder(eventLoop, cyclicBarrier, () -> new ControllableHandler(priority)); - executorService.submit(handlerAdder); + executorService.execute(handlerAdder); handlerAdders.add(handlerAdder); } eventLoopStopper.waitUntilEventLoopStopped(); @@ -160,10 +160,10 @@ private void canConcurrentlyAddTerminatingHandlersAndStartEventLoop(EventLoopTes List handlerAdders = new ArrayList<>(); CyclicBarrier cyclicBarrier = new CyclicBarrier(NUM_EVENT_ADDERS + 1); final EventLoopStarter eventLoopStarter = new EventLoopStarter(eventLoop, cyclicBarrier); - executorService.submit(eventLoopStarter); + executorService.execute(eventLoopStarter); for (int i = 0; i < NUM_EVENT_ADDERS; i++) { final HandlerAdder handlerAdder = new HandlerAdder(eventLoop, cyclicBarrier, () -> new ControllableHandler(priority, 0)); - executorService.submit(handlerAdder); + executorService.execute(handlerAdder); handlerAdders.add(handlerAdder); } // wait until the starter has started the event loop @@ -194,17 +194,17 @@ private void canConcurrentlyAddTerminatingHandlersAndStopEventLoop(EventLoopTest Jvm.startup().on(EventLoopConcurrencyStressTest.class, "Executing test for " + parameters.eventLoopClass.getSimpleName() + " at priority " + priority); ExecutorService executorService = Executors.newCachedThreadPool(); try (AbstractLifecycleEventLoop eventLoop = parameters.eventLoopSupplier.get()) { - List handlerAdders = new ArrayList<>(); eventLoop.start(); while (!eventLoop.isStarted()) { Jvm.pause(1); } CyclicBarrier cyclicBarrier = new CyclicBarrier(NUM_EVENT_ADDERS + 1); final EventLoopStopper 
eventLoopStopper = new EventLoopStopper(eventLoop, cyclicBarrier); - executorService.submit(eventLoopStopper); + executorService.execute(eventLoopStopper); + List handlerAdders = new ArrayList<>(); for (int i = 0; i < NUM_EVENT_ADDERS; i++) { final HandlerAdder handlerAdder = new HandlerAdder(eventLoop, cyclicBarrier, () -> new ControllableHandler(priority, 0)); - executorService.submit(handlerAdder); + executorService.execute(handlerAdder); handlerAdders.add(handlerAdder); } eventLoopStopper.waitUntilEventLoopStopped(); @@ -226,6 +226,7 @@ static final class EventLoopStarter implements Runnable { hasStartedEventLoop = new Semaphore(0); } + @SuppressWarnings("CallToPrintStackTrace") public void run() { try { await(cyclicBarrier); @@ -257,6 +258,7 @@ static final class EventLoopStopper implements Runnable { hasStoppedEventLoop = new Semaphore(0); } + @SuppressWarnings("CallToPrintStackTrace") public void run() { try { await(cyclicBarrier); @@ -295,6 +297,7 @@ static final class HandlerAdder implements Runnable { this.stoppedAddingHandlers = new Semaphore(0); } + @SuppressWarnings("CallToPrintStackTrace") @Override public void run() { try { @@ -433,7 +436,7 @@ private static void await(CyclicBarrier cyclicBarrier) { private static void pauseMicros(long timeToSleepMicros) { long endTimeNanos = System.nanoTime() + timeToSleepMicros * 1_000; while (System.nanoTime() < endTimeNanos) { - // do nothing + Thread.yield(); } } } diff --git a/src/test/java/net/openhft/chronicle/threads/EventLoopsTest.java b/src/test/java/net/openhft/chronicle/threads/EventLoopsTest.java index 855fe4750..56231909c 100644 --- a/src/test/java/net/openhft/chronicle/threads/EventLoopsTest.java +++ b/src/test/java/net/openhft/chronicle/threads/EventLoopsTest.java @@ -42,7 +42,7 @@ void stopAllCanHandleNulls() { final ExceptionHandler eh = (c, m, t) -> sb.append(m); ExceptionHandler exceptionHandler = Jvm.warn(); try { - Jvm.setWarnExceptionHandler(exceptionHandler); + 
Jvm.setWarnExceptionHandler(eh); EventLoops.stopAll(null, Arrays.asList(null, null, null), null); // Should silently accept nulls assertTrue(sb.toString().isEmpty()); diff --git a/src/test/java/net/openhft/chronicle/threads/LongPauserBenchmark.java b/src/test/java/net/openhft/chronicle/threads/LongPauserBenchmark.java index b3cb5adb0..9545ceedd 100644 --- a/src/test/java/net/openhft/chronicle/threads/LongPauserBenchmark.java +++ b/src/test/java/net/openhft/chronicle/threads/LongPauserBenchmark.java @@ -9,7 +9,7 @@ /** * Benchmark used to gauge the overhead of waking a {@link LongPauser}. - * + * <p>
* A helper thread loops calling {@link LongPauser#pause()} and then yields. * The main thread repeatedly invokes {@link LongPauser#unpause()} a fixed * number of times and measures the elapsed time. Dividing the total by the diff --git a/src/test/java/net/openhft/chronicle/threads/LoopIntrospectionTest.java b/src/test/java/net/openhft/chronicle/threads/LoopIntrospectionTest.java index abccf6fc3..3f3277dca 100644 --- a/src/test/java/net/openhft/chronicle/threads/LoopIntrospectionTest.java +++ b/src/test/java/net/openhft/chronicle/threads/LoopIntrospectionTest.java @@ -21,6 +21,7 @@ class LoopIntrospectionTest extends ThreadsTestCommon { + @SuppressWarnings("InstantiatingAThreadWithDefaultRunMethod") @Test void mediumEventLoopReportsRunningThread() throws InterruptedException { AtomicReference<Thread> loopThread = new AtomicReference<>(); @@ -31,18 +32,10 @@ void mediumEventLoopReportsRunningThread() throws InterruptedException { loop.start(); Waiters.waitForCondition("Medium loop did not start", loop::isStarted, 5_000); - loop.addHandler(new EventHandler() { - @Override - public @NotNull HandlerPriority priority() { - return HandlerPriority.MEDIUM; - } - - @Override - public boolean action() { - loopThread.compareAndSet(null, Thread.currentThread()); - firstInvocation.countDown(); - return false; - } + loop.addHandler(() -> { + loopThread.compareAndSet(null, Thread.currentThread()); + firstInvocation.countDown(); + return false; }); assertTrue(firstInvocation.await(5, TimeUnit.SECONDS), "Handler never ran on medium loop"); @@ -78,14 +71,15 @@ void blockingEventLoopReportsRunningThread() throws InterruptedException { } } + @SuppressWarnings("InstantiatingAThreadWithDefaultRunMethod") @Test - void eventGroupAggregatesRunningThreadChecks() throws InterruptedException { + void eventGroupAggregatesRunningThreadChecks() { AtomicReference<Thread> highThread = new AtomicReference<>(); AtomicReference<Thread> blockingThread = new AtomicReference<>(); AtomicReference<Thread> monitorThread = new 
AtomicReference<>(); int previousDelay = MonitorEventLoop.MONITOR_INITIAL_DELAY_MS; - MonitorEventLoop.MONITOR_INITIAL_DELAY_MS = 1; + setMonitorInitialDelayMs(1); try (EventGroup group = EventGroup.builder() .withPriorities(EnumSet.of(HandlerPriority.HIGH, HandlerPriority.BLOCKING, HandlerPriority.MONITOR)) .withPauser(Pauser.balanced()) @@ -106,7 +100,7 @@ void eventGroupAggregatesRunningThreadChecks() throws InterruptedException { assertTrue(group.isRunningOnThread(monitorThread.get()), "Group did not recognise monitor loop thread"); assertFalse(group.isRunningOnThread(new Thread()), "Group matched unrelated thread"); } finally { - MonitorEventLoop.MONITOR_INITIAL_DELAY_MS = previousDelay; + setMonitorInitialDelayMs(previousDelay); } } @@ -140,7 +134,7 @@ private RecordingHandler(HandlerPriority priority, AtomicReference<Thread> threa } @Override - public boolean action() throws InvalidEventHandlerException { + public boolean action() { threadRef.compareAndSet(null, Thread.currentThread()); return false; } diff --git a/src/test/java/net/openhft/chronicle/threads/MediumEventLoopTest.java b/src/test/java/net/openhft/chronicle/threads/MediumEventLoopTest.java index 0a8488828..51bd1b97e 100644 --- a/src/test/java/net/openhft/chronicle/threads/MediumEventLoopTest.java +++ b/src/test/java/net/openhft/chronicle/threads/MediumEventLoopTest.java @@ -3,7 +3,6 @@ */ package net.openhft.chronicle.threads; -import net.openhft.chronicle.core.io.InvalidMarshallableException; import net.openhft.chronicle.core.threads.EventHandler; import net.openhft.chronicle.core.threads.HandlerPriority; import net.openhft.chronicle.core.threads.InvalidEventHandlerException; @@ -54,18 +53,15 @@ void testAddingTwoEventHandlersWithBlockedMainLoopDoesNotHang() { try (MediumEventLoop eventLoop = new MediumEventLoop(null, "name", Pauser.balanced(), true, null)) { eventLoop.start(); CyclicBarrier barrier = new CyclicBarrier(3); - eventLoop.addHandler(new EventHandler() { - @Override - public boolean 
action() throws InvalidEventHandlerException, InvalidMarshallableException { - try { - barrier.await(); - return false; - } catch (InterruptedException e) { - Thread.currentThread().interrupt(); - throw new InvalidEventHandlerException(); - } catch (BrokenBarrierException e) { - throw new InvalidEventHandlerException(); - } + eventLoop.addHandler(() -> { + try { + barrier.await(); + return false; + } catch (InterruptedException e) { + Thread.currentThread().interrupt(); + throw new InvalidEventHandlerException(); + } catch (BrokenBarrierException e) { + throw new InvalidEventHandlerException(); } }); IntStream.range(0, 2).parallel() diff --git a/src/test/java/net/openhft/chronicle/threads/PauserTest.java b/src/test/java/net/openhft/chronicle/threads/PauserTest.java index 2fa03c76b..3229f4279 100644 --- a/src/test/java/net/openhft/chronicle/threads/PauserTest.java +++ b/src/test/java/net/openhft/chronicle/threads/PauserTest.java @@ -20,7 +20,6 @@ * {@link BusyPauser} is additionally checked to confirm it does not record * pause counts and rejects timed pauses. */ - class PauserTest extends ThreadsTestCommon { @Test @@ -42,6 +41,7 @@ void busy() throws TimeoutException { try { pauser.pause(1, TimeUnit.MILLISECONDS); } catch (UnsupportedOperationException ignored) { + // BusyPauser does not support timed pauses; expected. } assertEquals(0, pauser.countPaused()); pauser.unpause(); diff --git a/src/test/java/net/openhft/chronicle/threads/PauserTimeoutTest.java b/src/test/java/net/openhft/chronicle/threads/PauserTimeoutTest.java index b252078f2..6a287e4a6 100644 --- a/src/test/java/net/openhft/chronicle/threads/PauserTimeoutTest.java +++ b/src/test/java/net/openhft/chronicle/threads/PauserTimeoutTest.java @@ -20,15 +20,15 @@ * timeout is supplied. 
*/ class PauserTimeoutTest extends ThreadsTestCommon { - private Pauser[] pausersSupportTimeout = { + private final Pauser[] pausersSupportTimeout = { Pauser.balanced(), Pauser.sleepy(), new BusyTimedPauser(), new YieldingPauser(0), new LongPauser(0, 0, 1, 10, TimeUnit.MILLISECONDS), -// new MilliPauser(1) + // new MilliPauser(1) }; - private Pauser[] pausersDontSupportTimeout = { + private final Pauser[] pausersDontSupportTimeout = { BusyPauser.INSTANCE}; /** @@ -42,12 +42,16 @@ void pausersSupportTimeout() { int timeoutNS = 100_000_000; for (Pauser p : pausersSupportTimeout) { long start = System.nanoTime(); - do try { - p.pause(timeoutNS, TimeUnit.NANOSECONDS); - } catch (TimeoutException e) { - fail(p + " timed out"); + do { + try { + p.pause(timeoutNS, TimeUnit.NANOSECONDS); + } catch (TimeoutException e) { + fail(p + " timed out"); + } } while (System.nanoTime() < start + timeoutNS / 2); - while (System.nanoTime() < start + timeoutNS * 5 / 4) ; + while (System.nanoTime() < start + timeoutNS * 5 / 4) { + Thread.yield(); + } try { p.pause(timeoutNS, TimeUnit.NANOSECONDS); } catch (TimeoutException e) { diff --git a/src/test/java/net/openhft/chronicle/threads/StopVCloseTest.java b/src/test/java/net/openhft/chronicle/threads/StopVCloseTest.java index 37af45f79..c497ef626 100644 --- a/src/test/java/net/openhft/chronicle/threads/StopVCloseTest.java +++ b/src/test/java/net/openhft/chronicle/threads/StopVCloseTest.java @@ -36,12 +36,12 @@ public class StopVCloseTest extends ThreadsTestCommon { @BeforeEach void handlersInit() { ignoreException("Monitoring a task which has finished "); - MonitorEventLoop.MONITOR_INITIAL_DELAY_MS = 1; + setMonitorInitialDelayMs(1); } @Override public void preAfter() { - MonitorEventLoop.MONITOR_INITIAL_DELAY_MS = 10_000; + setMonitorInitialDelayMs(10_000); } @Test @@ -88,6 +88,7 @@ public void loopFinished() { } } + @SuppressWarnings("CallToPrintStackTrace") @Test void blockingStopped() throws InterruptedException { BlockingEventLoop 
bel = new BlockingEventLoop("blocking"); diff --git a/src/test/java/net/openhft/chronicle/threads/ThreadsTest.java b/src/test/java/net/openhft/chronicle/threads/ThreadsTest.java index 4f664dc84..b057134a8 100644 --- a/src/test/java/net/openhft/chronicle/threads/ThreadsTest.java +++ b/src/test/java/net/openhft/chronicle/threads/ThreadsTest.java @@ -18,7 +18,7 @@ class ThreadsTest extends ThreadsTestCommon { void shouldDumpStackTracesForStuckDelegatedExecutors() { final AtomicBoolean running = new AtomicBoolean(true); final ExecutorService service = Executors.newSingleThreadExecutor(new NamedThreadFactory("non-daemon-test")); - service.submit(() -> { + service.execute(() -> { while (running.get()) { Jvm.pause(10L); } @@ -35,7 +35,7 @@ void shouldDumpStackTracesForStuckDelegatedExecutors() { void shouldDumpStackTracesForStuckDaemonDelegatedExecutors() { final AtomicBoolean running = new AtomicBoolean(true); final ExecutorService service = Executors.newSingleThreadExecutor(new NamedThreadFactory("daemon-test")); - service.submit(() -> { + service.execute(() -> { while (running.get()) { Jvm.pause(10L); } @@ -58,7 +58,7 @@ void shouldDumpStackTracesForStuckNestedDelegatedExecutors() { ) ) ); - service.submit(() -> { + service.execute(() -> { while (running.get()) { Jvm.pause(10L); } diff --git a/src/test/java/net/openhft/chronicle/threads/ThreadsTestCommon.java b/src/test/java/net/openhft/chronicle/threads/ThreadsTestCommon.java index 5382d8b5e..1e77cdd0e 100644 --- a/src/test/java/net/openhft/chronicle/threads/ThreadsTestCommon.java +++ b/src/test/java/net/openhft/chronicle/threads/ThreadsTestCommon.java @@ -23,7 +23,7 @@ public class ThreadsTestCommon { private final Map<Predicate<ExceptionKey>, String> ignoreExceptions = new LinkedHashMap<>(); - private Map<Predicate<ExceptionKey>, String> expectedExceptions = new LinkedHashMap<>(); + private final Map<Predicate<ExceptionKey>, String> expectedExceptions = new LinkedHashMap<>(); private ThreadDump threadDump; private Map exceptions; @@ -108,7 +108,7 @@ private void 
assertExceptionThrown(Predicate<ExceptionKey> predicate, String des @AfterEach public void afterChecks() throws InterruptedException { preAfter(); - SystemTimeProvider.CLOCK = SystemTimeProvider.INSTANCE; + resetSystemTimeProviderClock(); CleaningThread.performCleanup(Thread.currentThread()); System.gc(); @@ -116,13 +116,23 @@ public void afterChecks() throws InterruptedException { assertReferencesReleased(); checkThreadDump(); checkExceptions(); - - tearDown(); } void preAfter() throws InterruptedException { } - private void tearDown() { + /** + * Test-only helper to adjust the initial monitor delay in a single place. + * This keeps static mutations out of instance lifecycle methods for SpotBugs. + */ + protected static void setMonitorInitialDelayMs(int delayMillis) { + MonitorEventLoop.MONITOR_INITIAL_DELAY_MS = delayMillis; + } + + /** + * Resets the global SystemTimeProvider clock to the default instance. + */ + protected static void resetSystemTimeProviderClock() { + SystemTimeProvider.CLOCK = SystemTimeProvider.INSTANCE; } } diff --git a/src/test/java/net/openhft/chronicle/threads/VanillaEventLoopTest.java b/src/test/java/net/openhft/chronicle/threads/VanillaEventLoopTest.java index ec96b663c..4fd7ac49b 100644 --- a/src/test/java/net/openhft/chronicle/threads/VanillaEventLoopTest.java +++ b/src/test/java/net/openhft/chronicle/threads/VanillaEventLoopTest.java @@ -3,7 +3,6 @@ */ package net.openhft.chronicle.threads; -import net.openhft.chronicle.core.io.InvalidMarshallableException; import net.openhft.chronicle.core.threads.EventHandler; import net.openhft.chronicle.core.threads.HandlerPriority; import net.openhft.chronicle.core.threads.InvalidEventHandlerException; @@ -46,18 +45,15 @@ void testAddingTwoEventHandlersWithBlockedMainLoopDoesNotHang() { try (VanillaEventLoop eventLoop = new VanillaEventLoop(null, "name", Pauser.balanced(), 1000L, true, null, VanillaEventLoop.ALLOWED_PRIORITIES)) { eventLoop.start(); CyclicBarrier barrier = new CyclicBarrier(3); - 
eventLoop.addHandler(new EventHandler() { - @Override - public boolean action() throws InvalidEventHandlerException, InvalidMarshallableException { - try { - barrier.await(); - return false; - } catch (InterruptedException e) { - Thread.currentThread().interrupt(); - throw new InvalidEventHandlerException(); - } catch (BrokenBarrierException e) { - throw new InvalidEventHandlerException(); - } + eventLoop.addHandler(() -> { + try { + barrier.await(); + return false; + } catch (InterruptedException e) { + Thread.currentThread().interrupt(); + throw new InvalidEventHandlerException(); + } catch (BrokenBarrierException e) { + throw new InvalidEventHandlerException(); } }); IntStream.range(0, 2).parallel() @@ -75,6 +71,7 @@ public boolean action() throws InvalidEventHandlerException, InvalidMarshallable () -> eventLoop.mediumHandlersArray.length == 3, 1000); } } + assertTrue(true); // If we reach here, the test passed } private void addingHandlerBeforeStart(CountingHandler handler) { diff --git a/src/test/java/net/openhft/chronicle/threads/example/SingleAndMultiThreadedExample.java b/src/test/java/net/openhft/chronicle/threads/example/SingleAndMultiThreadedExample.java index 5b6369697..4d7129067 100644 --- a/src/test/java/net/openhft/chronicle/threads/example/SingleAndMultiThreadedExample.java +++ b/src/test/java/net/openhft/chronicle/threads/example/SingleAndMultiThreadedExample.java @@ -21,7 +21,7 @@ */ public class SingleAndMultiThreadedExample { - private AtomicLong multiThreadedValue = new AtomicLong(); + private final AtomicLong multiThreadedValue = new AtomicLong(); private long singleThreadedValue; /** @@ -50,17 +50,13 @@ private void multiThreadedExample() throws ExecutionException, InterruptedExcept // example using Java Threads final ExecutorService executorService = newCachedThreadPool(); - Future f1 = executorService.submit(this::addOneHundred); - Future f2 = executorService.submit(this::addOneHundred); - Future f3 = 
executorService.submit(this::addOneHundred); - Future f4 = executorService.submit(this::addOneHundred); - Future f5 = executorService.submit(this::addOneHundred); - - f1.get(); - f2.get(); - f3.get(); - f4.get(); - f5.get(); + final Future[] futures = new Future[5]; + for (int i = 0; i < futures.length; i++) { + futures[i] = executorService.submit(this::addOneHundred); + } + for (Future future : futures) { + future.get(); + } System.out.println("multiThreadedValue=" + multiThreadedValue); } diff --git a/src/test/java/net/openhft/chronicle/threads/internal/EventLoopStateRendererTest.java b/src/test/java/net/openhft/chronicle/threads/internal/EventLoopStateRendererTest.java index f386e74b6..caf86684b 100644 --- a/src/test/java/net/openhft/chronicle/threads/internal/EventLoopStateRendererTest.java +++ b/src/test/java/net/openhft/chronicle/threads/internal/EventLoopStateRendererTest.java @@ -31,7 +31,7 @@ void testCanRenderMediumEventLoop() { assertTrue(dump.contains("Closed: false")); assertTrue(dump.contains("Closing: false")); assertTrue(dump.contains("Lifecycle: STARTED")); - assertTrue(dump.contains("Thread state: ")); + assertThreadDetailsPresent(dump); } } @@ -52,7 +52,7 @@ void testCanRenderStoppedMediumEventLoop() { assertTrue(dump.contains("Closed: false")); assertTrue(dump.contains("Closing: false")); assertTrue(dump.contains("Lifecycle: STOPPED")); - assertTrue(dump.contains("Thread state: ")); + assertThreadDetailsPresent(dump); } } @@ -99,4 +99,8 @@ void testCanRenderEventGroup() { assertTrue(dump.contains("Lifecycle: STARTED")); } } + + private static void assertThreadDetailsPresent(String dump) { + assertTrue(dump.contains("Thread state: ") || dump.contains("Thread is null")); + } } diff --git a/src/test/java/net/openhft/chronicle/threads/internal/ThreadMonitorHarnessTest.java b/src/test/java/net/openhft/chronicle/threads/internal/ThreadMonitorHarnessTest.java index 280a5d3a8..40a9234ca 100644 --- 
a/src/test/java/net/openhft/chronicle/threads/internal/ThreadMonitorHarnessTest.java +++ b/src/test/java/net/openhft/chronicle/threads/internal/ThreadMonitorHarnessTest.java @@ -31,7 +31,7 @@ class ThreadMonitorHarnessTest { private LongSupplier timeSupplier; @BeforeEach - void setUp() throws InvalidEventHandlerException { + void setUp() { threadMonitorHarness = new ThreadMonitorHarness(threadHolder, timeSupplier); lenient().when(threadHolder.isAlive()).thenReturn(true); lenient().when(threadHolder.timingToleranceNS()).thenReturn(TIMING_TOLERANCE_NS); @@ -39,7 +39,7 @@ void setUp() throws InvalidEventHandlerException { } @Test - void willCallThreadFinishedThenTerminateWhenThreadIsNoLongerAlive() throws InvalidEventHandlerException { + void willCallThreadFinishedThenTerminateWhenThreadIsNoLongerAlive() { when(threadHolder.isAlive()).thenReturn(false); assertThrows(InvalidEventHandlerException.class, () -> threadMonitorHarness.action()); diff --git a/systemProperties.adoc b/systemProperties.adoc index eb782dd8a..fbe86f25d 100644 --- a/systemProperties.adoc +++ b/systemProperties.adoc @@ -1,24 +1,30 @@ += Chronicle Threads System Properties +:toc: +:lang: en-GB +:source-highlighter: rouge + == System Properties -Chronicle Threads reads several system properties at start up. +Chronicle Threads reads several system properties at start-up. These values tune event loops, pausing strategies, monitoring intervals and disk space checks. All properties may be supplied on the command line with `-D` flags. -NOTE: All boolean properties below are read using link:https://javadoc.io/static/net.openhft/chronicle-core/2.23ea13/net/openhft/chronicle/core/Jvm.html#getBoolean-java.lang.String-[net.openhft.chronicle.core.Jvm.getBoolean(java.lang.String)], and so are enabled if either `-Dflag` or `-Dflag=true` or `-Dflag=yes`. +NOTE: All boolean properties below are read using `net.openhft.chronicle.core.Jvm.getBoolean(String)`. 
+A property is considered enabled if it is present (for example `-Dflag`) or set to `true`/`yes`. === Disk monitoring -[cols=4*,options="header"] +[cols="2a,1,3a,2a",options="header"] |=== | Property Key | Default | Description | Java Variable Name (Type) -| chronicle.disk.monitor.disable | `false` | Disable the background disk space monitor | _DISABLED_ (boolean) -| chronicle.disk.monitor.threshold.percent | 5% | Issue warnings when free space drops below this percentage | _thresholdPercentage_ (int) +| chronicle.disk.monitor.disable | `false` | Disable the background disk space monitor (see decision link:src/main/docs/decision-log.adoc#THR-OPS-003[THR-OPS-003]) | _DISABLED_ (boolean) +| chronicle.disk.monitor.threshold.percent | 5% | Issue warnings when free space drops below this percentage (see decision link:src/main/docs/decision-log.adoc#THR-OPS-003[THR-OPS-003]) | _thresholdPercentage_ (int) | disk.monitor.deleted.warning | `false` | Warn if disk space cannot be determined | _WARN_DELETED_ (boolean) |=== === Event loops -[cols=4*,options="header"] +[cols="2a,1,3a,2a",options="header"] |=== | Property Key | Default | Description | Java Variable Name (Type) | eventloop.accept.mod | 128 | Prevent starvation by inserting new handlers every modulo iteration | _ACCEPT_HANDLER_MOD_COUNT_ (int) @@ -30,7 +36,7 @@ NOTE: All boolean properties below are read using link:https://javadoc.io/static === Pausers -[cols=4*,options="header"] +[cols="2a,1,3a,2a",options="header"] |=== | Property Key | Default | Description | Java Variable Name (Type) | pauser.minProcessors | 4 | Minimum number of processors required before busy pausing is used | _MIN_PROCESSORS_ (int) @@ -38,7 +44,7 @@ NOTE: All boolean properties below are read using link:https://javadoc.io/static === Monitoring -[cols=4*,options="header"] +[cols="2a,1,3a,2a",options="header"] |=== | Property Key | Default | Description | Java Variable Name (Type) | disableLoopBlockMonitor | `false` | Disable loop block 
monitoring | _ENABLE_LOOP_BLOCK_MONITOR_ (boolean)