How to accurately simulate the concurrent data transfer characteristics of multi-channel DDR?

I tried two approaches to utilize multiple channels. 

(1) The first was simply setting `controllers=4` in the `cfg` file.
```python
mem = {
    controllers = 4;
    type = "DDR";
    ranksPerChannel = 4;
    banksPerRank = 8;
    tech = "DDR4-3200-CL22";
};
```
Compared with `controllers = 1` (config below), the results were almost identical. The metric I focus on is the CPU IPC taken from the output file `zsim.out`.
```python
mem = {
    controllers = 1;
    type = "DDR";
    ranksPerChannel = 4;
    banksPerRank = 8;
    tech = "DDR4-3200-CL22";
};
```
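
As a rough sanity check on whether the workload can use more than one channel at all, the sketch below compares the bandwidth the workload demands with the theoretical peak of a single DDR4-3200 channel (3200 MT/s × 8 bytes = 25.6 GB/s, assuming a 64-bit data bus). The function names and sample numbers are hypothetical and not part of zsim; the access and cycle counts would have to come from the statistics of an actual run.

```cpp
#include <cassert>
#include <cstdint>

// Peak theoretical bandwidth of one channel in bytes/s.
// transfersPerSec: data rate, e.g. 3200e6 for DDR4-3200.
// busBytes: data-bus width in bytes (assumed 8 for a 64-bit channel).
inline double peakChannelBandwidth(double transfersPerSec, uint32_t busBytes) {
    return transfersPerSec * busBytes;
}

// Bandwidth the workload actually demands, in bytes/s.
// memAccesses: cache lines read or written at the memory controllers.
// cycles: core cycles of the run; coreFreqHz: simulated core frequency.
inline double demandBandwidth(uint64_t memAccesses, uint32_t lineBytes,
                              uint64_t cycles, double coreFreqHz) {
    assert(cycles > 0);
    return static_cast<double>(memAccesses) * lineBytes * coreFreqHz
           / static_cast<double>(cycles);
}

// Fraction of a single channel's peak that the workload uses.
inline double channelUtilization(double demand, double peak) {
    assert(peak > 0.0);
    return demand / peak;
}
```

For example, 1,000,000 line accesses of 64 bytes over 100,000,000 cycles at 2 GHz demand 1.28 GB/s, about 5% of one channel. In that regime a single channel is far from saturated, so adding channels would not be expected to move IPC much.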

(2) The second approach is inspired by the implementation of [banshee](https://github.com/yxymit/banshee). It creates an array of four DDR channels to form a multi-channel DDR (`mcdram`). Each memory request is then routed to a channel by taking part of its address modulo the number of channels.
```cpp
_mcdram = (MemObject **) gm_malloc(sizeof(MemObject *) * _mcdram_per_mc);
for (uint32_t i = 0; i < _mcdram_per_mc; i++) {
    g_string mcdram_name = _name + g_string("-mc-") + g_string(to_string(i).c_str());
    // ...
    } else if (_mcdram_type == "DDR") {
        // XXX HACK tBL for mcdram is 1, so for data access, should multiply by 2, for tad access, should multiply by 3.
        _mcdram[i] = BuildDDRMemory(config, frequency, domain, mcdram_name, "sys.mem.mcdram.", 1, timing_scale);
    } //....
```
```cpp
Address address = req.lineAddr;
uint32_t mcdram_select = (address / 64) % _mcdram_per_mc;
Address mc_address = (address / 64 / _mcdram_per_mc * 64) | (address % 64);
//...
if (_scheme == CacheOnly) {
    req.lineAddr = mc_address;
    req.cycle = _mcdram[mcdram_select]->access(req, 0, 4);
    req.lineAddr = address;
    _numLoadHit.inc();
    futex_unlock(&_lock);
    return req.cycle;
}
//...
```
Unfortunately, the IPC was still almost identical to that of the plain DDR setup with `controllers = 1`.
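
To make the address arithmetic in the snippet above concrete, here is a minimal standalone sketch of the same mapping. It assumes that `req.lineAddr` is already a cache-line address (as the name suggests) and that lines are 64 bytes, in which case `(address / 64) % _mcdram_per_mc` switches channels only every 64 lines, i.e. every 4 KiB. `mapCoarse`, `mapFine`, and `ChannelMapping` are hypothetical names; `mapFine` is a per-line interleaving shown only for comparison and is not taken from banshee.

```cpp
#include <cassert>
#include <cstdint>

struct ChannelMapping {
    uint32_t channel;        // which channel serves the request
    uint64_t localLineAddr;  // address as seen inside that channel
};

// Same arithmetic as the snippet above: the low 6 bits of the line
// address stay in place and the next bits select the channel.
inline ChannelMapping mapCoarse(uint64_t lineAddr, uint32_t numChannels) {
    assert(numChannels > 0);
    ChannelMapping m;
    m.channel = static_cast<uint32_t>((lineAddr / 64) % numChannels);
    m.localLineAddr = (lineAddr / 64 / numChannels * 64) | (lineAddr % 64);
    return m;
}

// Hypothetical alternative: consecutive lines go to consecutive channels.
inline ChannelMapping mapFine(uint64_t lineAddr, uint32_t numChannels) {
    assert(numChannels > 0);
    ChannelMapping m;
    m.channel = static_cast<uint32_t>(lineAddr % numChannels);
    m.localLineAddr = lineAddr / numChannels;
    return m;
}
```

With four channels, `mapCoarse` sends line addresses 0 through 63 to channel 0 and 64 through 127 to channel 1, so a streaming access pattern touches one channel at a time, whereas `mapFine` spreads lines 0, 1, 2, 3 over channels 0, 1, 2, 3.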

To understand this better, I went through several past issues. For instance, I tried changing tCK to increase bandwidth and adjusting tBL. These changes had some effect, but the improvement was not significant. I also looked at the `MemChannel` implementation in [zsim-ndp](https://github.com/CriusT/zsim-ndp) ([mem_channel.cpp](https://github.com/CriusT/zsim-ndp/blob/master/src/mem_channel.cpp)), but ran into the same lack of improvement. Changing the memory interleaving scheme did not noticeably help either.

I added debugging output to the `trySchedule` function in `ddr_mem.cpp`. Comparing the output, I found that the two methods above for building a multi-channel DDR system produced almost identical `r->arrivalCycle` sequences. When timing parameters such as tBL were modified, the numbers changed, but the pattern remained largely the same.

```cpp
uint64_t DDRMemory::trySchedule(uint64_t curCycle, uint64_t sysCycle) {
//...
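// Print the address in hex so that it matches the "0x" prefix, then switch back to decimal.
std::cout << curCycle << " Found ready request 0x" << std::hex << r->addr << std::dec << "   r->arrCycle= " << r->arrivalCycle << std::endl;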
//...
}
```
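
One way to compare such logs more systematically is to summarize them per channel. The sketch below is a hypothetical post-processing helper, not zsim code: it assumes each logged request has been reduced to a (channel, arrival cycle) pair sorted by cycle, and it reports how many requests each channel received and the mean gap between consecutive arrivals.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct Arrival {
    uint32_t channel;  // channel index parsed from the log
    uint64_t cycle;    // r->arrivalCycle parsed from the log
};

struct TraceSummary {
    std::vector<uint64_t> perChannel;  // request count per channel
    double meanGap;                    // mean cycles between arrivals
};

// Assumes `trace` is sorted by arrival cycle.
inline TraceSummary summarize(const std::vector<Arrival>& trace,
                              uint32_t numChannels) {
    assert(numChannels > 0);
    TraceSummary s;
    s.perChannel.assign(numChannels, 0);
    s.meanGap = 0.0;
    for (const Arrival& a : trace) {
        s.perChannel[a.channel % numChannels]++;
    }
    if (trace.size() > 1) {
        s.meanGap = static_cast<double>(trace.back().cycle - trace.front().cycle)
                    / static_cast<double>(trace.size() - 1);
    }
    return s;
}
```

If the per-channel counts are balanced but the mean gap is large compared with the time one channel needs to service a request, requests rarely overlap, and extra channels would have little to parallelize. That is only one possible reading of the data, not a confirmed diagnosis.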

I ran into a similar issue with the `gem5` simulator. This raises the question: are these discrete-event simulators inherently limited in how accurately they can model the parallelism of multi-channel memory systems, in particular their ability to deliver high bandwidth through concurrency?

**Thank you for any useful suggestions!**
