ritvikrao
Collaborator

Implements support for partitions (multiple instances of Charm++ running on the same communication and machine layer), including command-line definitions of partitions and messaging functions between partitions. Does not include topology-aware partitions.

@@ -929,4 +915,95 @@ extern "C" {

void registerTraceInit(void (*fn)(char **argv));

//partitions
#if CMK_HAS_PARTITION
Contributor

Where is CMK_HAS_PARTITION defined?

Collaborator Author

I just pushed a commit to add the definition in charm-config.h. It's really only needed in the charm build so I put it there.

@ericjbohm ericjbohm left a comment

Looks like it should work. Has it been tested with anything that actually uses partitions?

@ritvikrao
Collaborator Author

I am trying to run the partition test located in charm/tests/charm++/partitions, but the example command doesn't make sense to me. It is ./hello +pe 4 10 2 +partitions 2. That command inherently runs on 1 node, so how can you request 2 partitions? As a result I get this error:
------- Processor 0 Exiting: Called CmiAbort ------
Reason: Number of partitions does not evenly divide number of processes. Aborting

@ericjbohm

Partitioning needs to be able to slice along process boundaries. So, in a non-SMP build, that command line would work. In an SMP build you would need a more carefully constructed line with +p and +ppn to do the same. It wouldn't work at all in a multicore build.
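To illustrate the process-boundary constraint described above, here is a minimal sketch of the divisibility check behind the abort message. It is not the Charm++ implementation: `PartitionLayout`, `dividesEvenly`, and `partitionOf` are hypothetical names, and the contiguous block assignment of processes to partitions is an assumption made for the example.

```cpp
// Hypothetical description of a launch: how many OS processes exist and
// how many partitions were requested with +partitions.
struct PartitionLayout {
    int numProcesses;
    int numPartitions;
};

// True when partitions can be cut along process boundaries, i.e. every
// partition receives the same whole number of processes.
bool dividesEvenly(const PartitionLayout& layout) {
    return layout.numProcesses > 0 && layout.numPartitions > 0 &&
           layout.numProcesses % layout.numPartitions == 0;
}

// Returns the partition that owns `process`, or -1 if the layout is
// invalid or the process index is out of range.
// Assumption: processes are assigned to partitions in contiguous blocks.
int partitionOf(const PartitionLayout& layout, int process) {
    if (!dividesEvenly(layout) || process < 0 ||
        process >= layout.numProcesses) {
        return -1;
    }
    const int processesPerPartition =
        layout.numProcesses / layout.numPartitions;
    return process / processesPerPartition;
}
```

Under these assumptions, 4 single-PE processes (a non-SMP run) split cleanly into 2 partitions, while a single process holding all 4 PEs cannot be split into 2, which matches the abort reported above.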

@ritvikrao
Copy link
Collaborator Author

That test program requires creating P/2 partitions if you have P PEs. I ran the test with 2 PEs on 1 process in standalone mode, and with 4 PEs on 2 processes with LCI, and both worked.
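The arithmetic for the two runs reported above can be sketched as follows. `RunShape` and `describeRun` are hypothetical names for this illustration only. The P/2 rule comes from the comment above, and the even split of processes across partitions is an assumption.

```cpp
// Hypothetical summary of one run of the partition test.
struct RunShape {
    int partitions;             // P / 2, as the test is described above
    int processesPerPartition;  // 0 when processes do not split evenly
};

// Computes the partition count the test expects for `pes` PEs and how
// many of the `processes` each partition would receive.
RunShape describeRun(int pes, int processes) {
    const int partitions = pes / 2;
    const bool splitsEvenly =
        partitions > 0 && processes % partitions == 0;
    return {partitions, splitsEvenly ? processes / partitions : 0};
}
```

With 2 PEs on 1 process this gives 1 partition holding the single process. With 4 PEs on 2 processes it gives 2 partitions with 1 process each.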

@ericjbohm ericjbohm added the enhancement New feature or request label Sep 15, 2025
@ericjbohm ericjbohm left a comment

Looks good.

@ritvikrao ritvikrao merged commit e0370c3 into main Sep 17, 2025
2 checks passed
@ritvikrao ritvikrao deleted the partitions branch September 17, 2025 13:58