Synchronization
Revision as of 12:13, 27 March 2019
The nu+ manycore provides a hardware mechanism for synchronization, based on the barrier primitive. A barrier for a group of threads means that every thread must stop at that point in the source code and cannot proceed until all other threads in the group have reached it. The barrier synchronization implemented in this system follows a master-slave architecture, with the masters distributed across the tiles. Several independent barriers can be active on the manycore at the same time. Each barrier is identified by an ID, and each Synchronization Core master handles a range of barrier IDs.
Barrier Synchronization Protocol
The barrier protocol is based on message passing. There are three types of message:
- Setup_Message: configures the barrier structure in the Synchronization Core;
- Account_Message: notifies the Synchronization Core that a thread has arrived at the barrier;
- Release_Message: sent by the Synchronization Core to the Barrier Core to notify that all threads have arrived at the synchronization point, so the barrier can end.
In this protocol the Setup_Message and the Account_Messages may arrive out of order, and an enable bit is used to manage the Setup. The message layout changes with the number of barriers supported, as shown in the figure:
Each message is 1 flit (64 bits) long, and the sizes of its fields change with the number of barriers supported. The messages contain these fields:
- Type: specifies the type of message: '00' (Setup_Message), '01' (Account_Message), '10' (Release_Message). The size is 2 bits;
- Counter: specifies the number of hardware threads that take part in the barrier synchronization. The size is N bits, where N is the log2 of the total number of hardware threads in the manycore;
- Barrier ID: identifies the barrier. The size is M bits, where M is the log2 of the total number of barriers supported by the architecture; it is defined in a file (file name still to be specified);
- Tile Source: specifies the ID of the source tile of the barrier. It is used as the destination for the multicast of the Release_Message. The size is T bits, where T is the log2 of the total number of tiles in the manycore.
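The field widths described above can be sketched in a few lines of Python. This is only an illustration: the order in which the fields are laid out inside the flit is not given in the text (the figure is missing), so the layout used here (Type, Counter, Barrier ID, Tile Source, packed into the low bits of the word) is an assumption, as are the function names and the 64-thread system size. The 8 tiles and 128 barriers (8 tiles with 16 barrier IDs each) come from the example on this page.

```python
# Type encodings as listed in the field description.
SETUP, ACCOUNT, RELEASE = 0b00, 0b01, 0b10


def field_widths(num_threads, num_barriers, num_tiles):
    """Return (N, M, T): bit widths of Counter, Barrier ID and Tile Source."""
    def bits(n):
        # ceil(log2(n)) computed on integers, with a minimum of 1 bit
        return max(1, (n - 1).bit_length())
    return bits(num_threads), bits(num_barriers), bits(num_tiles)


def pack(msg_type, counter, barrier_id, tile_src, widths):
    """Pack the four fields into one integer (hypothetical field order)."""
    n, m, t = widths
    assert 2 + n + m + t <= 64, "fields must fit in one 64-bit flit"
    word = 0
    for value, width in ((msg_type, 2), (counter, n), (barrier_id, m), (tile_src, t)):
        assert 0 <= value < (1 << width), "value does not fit its field"
        word = (word << width) | value
    return word


def unpack(word, widths):
    """Inverse of pack(): returns (msg_type, counter, barrier_id, tile_src)."""
    n, m, t = widths
    fields = []
    # Fields are extracted from the least significant end, so in reverse order.
    for width in (t, m, n, 2):
        fields.append(word & ((1 << width) - 1))
        word >>= width
    tile_src, barrier_id, counter, msg_type = fields
    return msg_type, counter, barrier_id, tile_src


# Assumed system: 64 hardware threads (hypothetical), 128 barriers, 8 tiles.
widths = field_widths(num_threads=64, num_barriers=128, num_tiles=8)
flit = pack(SETUP, counter=32, barrier_id=26, tile_src=0, widths=widths)
```

With these sizes the four fields occupy 2 + 6 + 7 + 3 = 18 bits, well within the 64-bit flit.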
Example of Barrier
To explain the barrier synchronization protocol, we describe the example in the figure below. In this example there are 8 nu+ tiles, with two different groups of threads and one barrier for each group. Group 1 contains the threads in tiles 0 to 3, and group 2 the threads in the other tiles. Group 1 uses the barrier with ID 26, whereas group 2 uses the barrier with ID 43. Barrier IDs 26 and 43 are handled by the Synchronization Cores in tiles 3 and 5 respectively (each tile has a Synchronization Core that handles 16 barrier IDs).
In the example the Host Manager boots the kernel, prepares the Setup Message that configures the barrier, and sends it to the Synchronization Core master. The Synchronization Core then waits for the Account Messages from the threads and updates the barrier counter for each Account Message it receives. When all the Account Messages have arrived at the Synchronization Core, it notifies the threads with a Release Message that the barrier synchronization has ended.
Boot Setup
The Boot Setup allows communication between the host and the nu+ manycore. It is placed in the "tile h2c". The host sends the barrier command together with the data about the barrier. The Boot Setup then generates the Setup message for the Synchronization Core. The host's barrier setup message changes with the size of the manycore and the number of barriers supported.
(Image to be inserted: host message -> Setup message on the network.)
To generate the Setup_Message, the Boot Setup implements an FSM with 5 states:
- START_SETUP: prepares the Boot Setup to receive the boot message from the host;
- IDLE_SETUP: waits for a barrier command from the host. When a barrier command arrives, it holds the communication channel with the host through the wait_sync signal and moves to the SETUP_SERVICE state;
- SETUP_SERVICE:
- WAIT_NET_SETUP:
- SEND_SETUP:
Barrier Core
Synchronization Core
The Synchronization Core is the key component of our solution. This module acts as the synchronization master but, unlike other hardware synchronization architectures, it is distributed among all tiles in the manycore. As explained above, the Boot Setup module selects a specific Synchronization Core based on the barrier ID set by the user. In this way, the architecture spreads the synchronization messages across the whole manycore, so the synchronization master is no longer a network congestion point. Figure 2 (Sync core.png) shows a simplified view of the Synchronization Core. The module is made of three stages. The first stage selects and schedules the Setup and Account requests. The selected request steps into the second stage, which strips the control information from the message. In the last stage, the stripped request is finally processed. If it is a Setup request, the counter is initialized with the number of involved threads. If it is an Account message, the counter is decremented by 1. When the counter reaches 0, all the involved threads have hit the synchronization point, and the master sends a multicast Release message that reaches all of them.
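The counter behaviour of the last stage can be illustrated with a small behavioural model in Python. It is not the RTL and not necessarily how the hardware works: the class and method names are hypothetical, and the way the enable bit is used here (Setup adds to the counter instead of overwriting it, and a Release is only produced once Setup has been seen) is an assumption, chosen as one plausible way to tolerate Account messages that arrive before the Setup message.

```python
class SyncCoreModel:
    """Behavioural sketch of the per-barrier counter in a Synchronization Core."""

    def __init__(self):
        self.counter = {}  # barrier_id -> outstanding arrivals (negative before Setup)
        self.enabled = {}  # barrier_id -> True once the Setup message has been seen

    def setup(self, barrier_id, num_threads):
        """Process a Setup request; returns True if a Release must be sent."""
        # Assumption: add rather than overwrite, so early Accounts are not lost.
        self.counter[barrier_id] = self.counter.get(barrier_id, 0) + num_threads
        self.enabled[barrier_id] = True
        return self._maybe_release(barrier_id)

    def account(self, barrier_id):
        """Process an Account request; returns True if a Release must be sent."""
        self.counter[barrier_id] = self.counter.get(barrier_id, 0) - 1
        return self._maybe_release(barrier_id)

    def _maybe_release(self, barrier_id):
        if self.enabled.get(barrier_id, False) and self.counter[barrier_id] == 0:
            # All threads have arrived: clear the entry so the ID can be reused.
            del self.counter[barrier_id]
            self.enabled[barrier_id] = False
            return True
        return False


core = SyncCoreModel()

# In-order case: Setup for barrier 26 with 3 threads, then 3 Accounts.
in_order = [core.setup(26, 3), core.account(26), core.account(26), core.account(26)]

# Out-of-order case: one Account for barrier 43 arrives before its Setup.
out_of_order = [core.account(43), core.setup(43, 2), core.account(43)]
```

In both sequences only the last request triggers the Release, which in the real design would be the multicast message sent to the involved tiles.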