|
2 | 2 | ### General Motivation |
3 | 3 |
|
4 | 4 | Introduce the Labels mechanism. Give Labels to Actors/Tasks/Nodes/Objects. |
5 | | -Affinity features such as ActorAffinity/TaskAffinity/NodeAffinity can be realized through Labels. |
| 5 | +Affinity features such as ActorAffinity/NodeAffinity can be realized through Labels. |
6 | 6 |
|
7 | 7 |
|
8 | 8 | ### Should this change be within `ray` or outside? |
@@ -49,17 +49,12 @@ The apis of the actor-affinity/task-affinity/node-affinity scheduling. |
49 | 49 | SchedulingStrategyT = Union[None, str, |
50 | 50 | PlacementGroupSchedulingStrategy, |
51 | 51 | ActorAffinitySchedulingStrategy, |
52 | | - TaskAffinitySchedulingStrategy, |
53 | 52 | NodeAffinitySchedulingStrategy] |
54 | 53 |
|
55 | 54 | class ActorAffinitySchedulingStrategy: |
56 | 55 | def __init__(self, match_expressions: List[LabelMatchExpression]): |
57 | 56 | self.match_expressions = match_expressions |
58 | 57 |
|
59 | | -class TaskAffinitySchedulingStrategy: |
60 | | - def __init__(self, match_expressions: List[LabelMatchExpression]): |
61 | | - self.match_expressions = match_expressions |
62 | | - |
63 | 58 | class NodeAffinitySchedulingStrategy: |
64 | 59 | def __init__(self, match_expressions: List[LabelMatchExpression]): |
65 | 60 | self.match_expressions = match_expressions |
@@ -87,7 +82,112 @@ actor_1 = Actor.options(scheduling_strategy=ActorAffinitySchedulingStrategy([ |
87 | 82 | ])).remote() |
88 | 83 | ``` |
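To make the matching semantics concrete, here is a minimal pure-Python sketch of how a list of match expressions might be evaluated against the actors already running on a node. It does not use Ray and is not the proposal's implementation. The `EXISTS`/`DOES_NOT_EXIST` operators, the `soft` field name (assumed meaning of the fourth positional argument), and the helper names `expression_matches` and `node_satisfies` are all assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List


class LabelMatchOperator(Enum):
    IN = "IN"
    NOT_IN = "NOT_IN"
    EXISTS = "EXISTS"                  # assumed operator
    DOES_NOT_EXIST = "DOES_NOT_EXIST"  # assumed operator


@dataclass
class LabelMatchExpression:
    key: str
    operator: LabelMatchOperator
    values: List[str]
    soft: bool = False  # assumed meaning of the fourth argument


def expression_matches(expr: LabelMatchExpression, labels: Dict[str, str]) -> bool:
    """Evaluate one expression against a single actor's label set."""
    if expr.operator is LabelMatchOperator.IN:
        return labels.get(expr.key) in expr.values
    if expr.operator is LabelMatchOperator.NOT_IN:
        return labels.get(expr.key) not in expr.values
    if expr.operator is LabelMatchOperator.EXISTS:
        return expr.key in labels
    return expr.key not in labels


def node_satisfies(expr: LabelMatchExpression,
                   actors_on_node: List[Dict[str, str]]) -> bool:
    """Affinity needs at least one matching actor; anti-affinity needs none violating."""
    results = [expression_matches(expr, labels) for labels in actors_on_node]
    if expr.operator in (LabelMatchOperator.IN, LabelMatchOperator.EXISTS):
        return any(results)
    return all(results)  # an empty node trivially satisfies anti-affinity


spread = LabelMatchExpression("type", LabelMatchOperator.NOT_IN, ["cat"])
colocate = LabelMatchExpression("type", LabelMatchOperator.IN, ["dog"])
```

Under this model an empty node satisfies `spread` but not `colocate`, which is why the first actor of a co-located group has to be placed without an affinity constraint.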
89 | 84 |
|
| 85 | +### Example |
| 86 | + |
| 87 | +* Affinity |
| 88 | + * Co-locate actors on the same group of nodes, e.g. nodes in the same zone. |
| 89 | +* Anti-affinity |
| 90 | + * Spread the actors of a service across nodes and/or availability zones, e.g. to reduce correlated failures. |
| 91 | + |
| 92 | +**1. Spread Demo** |
| 93 | + |
| 94 | + |
| 95 | + |
| 96 | +``` |
| 97 | +@ray.remote |
| 98 | +class Cat: |
| 99 | +    pass |
| 100 | +
|
| 101 | +cats = [] |
| 102 | +for i in range(4): |
| 103 | +    cat = Cat.options( |
| 104 | +        labels={"type": "cat"}, |
| 105 | +        scheduling_strategy=ActorAffinitySchedulingStrategy([ |
| 106 | +            LabelMatchExpression( |
| 107 | +                "type", LabelMatchOperator.NOT_IN, ["cat"], False) |
| 108 | +        ])).remote() |
| 109 | +    cats.append(cat) |
| 110 | +``` |
| 111 | + |
| 112 | +**2. Co-locate Demo** |
| 113 | + |
| 114 | + |
90 | 115 |
|
| 116 | +``` |
| 117 | +@ray.remote |
| 118 | +class Dog: |
| 119 | +    pass |
| 120 | +
|
| 121 | +dogs = [] |
| 122 | +# First schedule a dog to a random node. |
| 123 | +dog_1 = Dog.options(labels={"type": "dog"}).remote() |
| 124 | +dogs.append(dog_1) |
| 125 | +
|
| 126 | +# Then schedule the remaining dogs to the same node as the first dog. |
| 127 | +for i in range(3): |
| 128 | +    dog = Dog.options(scheduling_strategy=ActorAffinitySchedulingStrategy([ |
| 129 | +        LabelMatchExpression( |
| 130 | +            "type", LabelMatchOperator.IN, ["dog"], False) |
| 131 | +    ])).remote() |
| 132 | +    dogs.append(dog) |
| 133 | +``` |
| 134 | + |
| 135 | +**3. Co-locate and spread combination demo** |
| 136 | + |
| 137 | + |
| 138 | + |
| 139 | +``` |
| 140 | +@ray.remote |
| 141 | +class Cat: |
| 142 | +    pass |
| 143 | +
|
| 144 | +@ray.remote |
| 145 | +class Dog: |
| 146 | +    pass |
| 147 | +
|
| 148 | +# First schedule one cat on each node. |
| 149 | +cats = [] |
| 150 | +for i in range(4): |
| 151 | +    cat = Cat.options( |
| 152 | +        labels={ |
| 153 | +            "type": "cat", |
| 154 | +            "id": "cat-" + str(i) |
| 155 | +        }, |
| 156 | +        scheduling_strategy=ActorAffinitySchedulingStrategy([ |
| 157 | +            LabelMatchExpression( |
| 158 | +                "type", LabelMatchOperator.NOT_IN, ["cat"], False) |
| 159 | +        ])).remote() |
| 160 | +    cats.append(cat) |
| 161 | +
|
| 162 | +# Then schedule 3 dogs next to each cat. |
| 163 | +dogs = [] |
| 164 | +for i in range(4): |
| 165 | +    node_dogs = [] |
| 166 | +    for _ in range(3): |
| 167 | +        dog = Dog.options( |
| 168 | +            labels={ |
| 169 | +                "type": "dog", |
| 170 | +            }, |
| 171 | +            scheduling_strategy=ActorAffinitySchedulingStrategy([ |
| 172 | +                LabelMatchExpression( |
| 173 | +                    "id", LabelMatchOperator.IN, ["cat-" + str(i)], False) |
| 174 | +            ])).remote() |
| 175 | +        node_dogs.append(dog) |
| 176 | +    dogs.append(node_dogs) |
| 177 | +``` |
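The expected placement for the combined demo can be checked with a toy, Ray-free model. This is only a sketch under stated assumptions: a four-node cluster, hard constraints only, and a hypothetical first-fit `schedule` helper that stands in for whatever policy the real scheduler would use.

```python
from typing import Dict, List, Optional, Tuple

# (key, operator, values); operator is "IN" or "NOT_IN". Hard constraints only.
Expression = Tuple[str, str, List[str]]
Labels = Dict[str, str]


def node_ok(actors_on_node: List[Labels], expr: Expression) -> bool:
    """IN needs at least one matching actor on the node; NOT_IN needs none."""
    key, op, values = expr
    hits = [labels.get(key) in values for labels in actors_on_node]
    return any(hits) if op == "IN" else not any(hits)


def schedule(cluster: Dict[str, List[Labels]], labels: Labels,
             exprs: List[Expression]) -> Optional[str]:
    """Hypothetical first-fit placement: pick the first node satisfying every expression."""
    for node, actors in cluster.items():
        if all(node_ok(actors, e) for e in exprs):
            actors.append(labels)
            return node
    return None  # no node satisfies the hard constraints


cluster: Dict[str, List[Labels]] = {f"node-{n}": [] for n in range(4)}

# One cat per node: each cat avoids nodes that already host a cat.
cat_nodes = [
    schedule(cluster, {"type": "cat", "id": f"cat-{i}"},
             [("type", "NOT_IN", ["cat"])])
    for i in range(4)
]

# Three dogs per cat: each dog follows the cat with the matching id.
dog_nodes = [
    [schedule(cluster, {"type": "dog"}, [("id", "IN", [f"cat-{i}"])])
     for _ in range(3)]
    for i in range(4)
]
```

In this model every cat lands on a distinct node and each group of dogs lands on the node of its cat. A fifth cat with the same hard `NOT_IN` constraint would find no feasible node.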
| 178 | + |
| 179 | +### Note |
| 180 | +1. ActorAffinity/NodeAffinity can be used together with the custom resources mechanism. |
| 181 | +The two mechanisms are independent and do not conflict. |
| 182 | +For example: |
| 183 | +``` |
| 184 | +actor_1 = Actor.options( |
| 185 | + resources={"4c8g": 1}, |
| 186 | + scheduling_strategy=ActorAffinitySchedulingStrategy([ |
| 187 | + LabelMatchExpression( |
| 188 | + "location", LabelMatchOperator.IN, ["dc_1"], False) |
| 189 | + ])).remote() |
| 190 | +``` |
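One way to picture why the two mechanisms do not conflict is to treat them as independent filters over candidate nodes. The sketch below is a pure-Python illustration, not Ray code. The node names, the helper names `fits_resources` and `label_in`, and the assumption that the `location` label is visible per node are all hypothetical.

```python
from typing import Dict, List


def fits_resources(available: Dict[str, float], demand: Dict[str, float]) -> bool:
    """Custom-resource filter: every requested amount must be available."""
    return all(available.get(name, 0) >= amount for name, amount in demand.items())


def label_in(labels: Dict[str, str], key: str, values: List[str]) -> bool:
    """Label filter: the node-visible label value must be one of `values`."""
    return labels.get(key) in values


# Hypothetical cluster state.
nodes = {
    "node-a": {"resources": {"4c8g": 1}, "labels": {"location": "dc_1"}},
    "node-b": {"resources": {"4c8g": 1}, "labels": {"location": "dc_2"}},
    "node-c": {"resources": {}, "labels": {"location": "dc_1"}},
}

# A node is feasible only if it passes both filters.
feasible = [
    name for name, node in nodes.items()
    if fits_resources(node["resources"], {"4c8g": 1})
    and label_in(node["labels"], "location", ["dc_1"])
]
```

Here `node-b` is excluded by the label filter and `node-c` by the resource filter, so only `node-a` remains feasible.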
91 | 191 | ### Implementation plan |
92 | 192 |
|
93 | 193 |  |
@@ -175,7 +275,7 @@ podAntiAffinity | POD | In, NotIn, Exists, DoesNotExist |
175 | 275 |
|
176 | 276 | ### what's the alternative to achieve the same goal? |
177 | 277 | **Option 2: Use LabelAffinitySchedulingStrategy instead of Actor/Task/NodeAffinitySchedulingStrategy** |
178 | | -Some people think that ActorAffinity/TaskAffinity is dispatched to the Node corresponding to the actor/Task with these labels. |
| 278 | +Some people view ActorAffinity as scheduling onto the Node that hosts the actor/Task carrying these labels. |
179 | 279 | So why not attach both ActorLabels and TaskLabels to the Node itself? |
180 | 280 | Then the scheduling API would only need the LabelAffinitySchedulingStrategy set of APIs instead of Actor/Task/NodeAffinitySchedulingStrategy. |
181 | 281 |
|
@@ -222,7 +322,7 @@ Advantages: |
222 | 322 | Example: |
223 | 323 | The user wants affinity scheduling toward <b>certain Actors and nodes in a specific machine room.</b> |
224 | 324 | However, according to internal user research, most requirements are simply to co-locate or spread Actors/Tasks. |
225 | | -So using a single ActorAffinity/TaskAffinity/NodeAffinity can already achieve practical effects. |
| 325 | +So a single ActorAffinity/NodeAffinity strategy is already sufficient in practice. |
226 | 326 |
|
227 | 327 | The same effect can also be achieved by combining Option 1 with custom resources. |
228 | 328 |
|
@@ -276,4 +376,7 @@ class ActorAffinitySchedulingStrategy: |
276 | 376 | ``` |
277 | 377 |
|
278 | 378 | ### 2. ObjectAffinitySchedulingStrategy |
279 | | -If the user has a request, you can consider adding the attributes of labels to objects. Then the strategy of ObjectAffinity can be launched。 |
| 379 | +If users request it, labels could also be added to objects, after which an ObjectAffinity strategy could be introduced. |
| 380 | + |
| 381 | +### 3. TaskAffinitySchedulingStrategy |
| 382 | +Because the Label resource synchronization mechanism is already implemented above, it would be easy to add a TaskAffinity strategy for Tasks. |