Question re. tRTP

nnimrod · July 29, 2021

Context: a Z97 platform with a Gigabyte SOC Force motherboard and Micron D9KPT memory.

If I understand correctly, a value for tRTP of less than (tCAS+tBurst) - tRP should be meaningless, right? Since the page can't be closed until it's finished with tBurst, even if it's finished precharging.

In my example case, I have

tCAS 5

tRP 5

tRTP 3 and 4

tBurst is always 4 (Since a 4 bit burst still takes 4 clock cycles)

Some SuperPi and AIDA64 results (the memory is at 666 MHz, which is why a run takes 7 minutes ~46 seconds):

tRTP 3

SuperPi 32m times in seconds (12 runs):
 1	465.828
 2	466.015
 3	466.110
 4	466.140
 5	466.203
 6	466.250
 7	466.343
 8	466.344
 9	466.344
10	466.375
11	466.453
12	467.891
Best half (runs 1-6): avg 466.091, range 0.422, stdev 0.152
Failed attempts: NEIR 17

AIDA64 (MB/s):
read	write	copy
20849	22328	19590
20841	22336	19433
20837	22328	19413
20856	22324	19425
20875	22335	19435
20852	22330	19459	average

tRTP 4

SuperPi 32m times in seconds (12 runs):
 1	465.594
 2	465.797
 3	465.890
 4	466.141
 5	466.156
 6	466.219
 7	466.265
 8	466.328
 9	466.453
10	466.484
11	466.860
12	467.015
Best half (runs 1-6): avg 465.966, range 0.625, stdev 0.246
Failed attempts: NEIR 19, 14, NCIS 7, NEIR 3

AIDA64 (MB/s):
read	write	copy
20785	22343	19428
20822	22316	19452
20838	22310	19431
20851	22331	19452
20789	22336	19442
20817	22327	19441	average
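
For anyone who wants to redo the "best half" figures, here is a minimal Python sketch. It assumes "best half" means the six fastest of the twelve runs and that "stdev" is the sample standard deviation; both are guesses that happen to match the posted numbers. The function name `best_half_stats` is made up for this example.

```python
from statistics import mean, stdev

def best_half_stats(times):
    """Return (average, range, sample stdev) of the fastest half of the runs."""
    best = sorted(times)[: len(times) // 2]
    return mean(best), max(best) - min(best), stdev(best)

# SuperPi 32m times in seconds, copied from the tables above.
trtp3 = [465.828, 466.015, 466.110, 466.140, 466.203, 466.250,
         466.343, 466.344, 466.344, 466.375, 466.453, 467.891]
trtp4 = [465.594, 465.797, 465.890, 466.141, 466.156, 466.219,
         466.265, 466.328, 466.453, 466.484, 466.860, 467.015]

for label, runs in (("tRTP 3", trtp3), ("tRTP 4", trtp4)):
    avg, rng, sd = best_half_stats(runs)
    print(f"{label}: avg {avg:.3f}  range {rng:.3f}  stdev {sd:.3f}")
```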

In 32m, tRTP 4 is generally faster, although less stable (4 fails vs 1 fail for tRTP 3). In AIDA, tRTP 3 has unambiguously better read performance and, by extension, slightly better copy.

I don't know why tighter tRTP is slower in 32m
I don't know why tighter tRTP is more stable in 32m
I don't know why tighter tRTP has better performance in AIDA
Tighter tRTP shouldn't matter at all because page closure is waiting on data transfer (tBurst), not tRP, right?

Is this just random error throwing me for a loop? Do you think increasing sample size will make this inconsistency go away?

alatron978 · July 30, 2021

No, it does not have to be. Firstly I'll explain the prefetch architecture, then a read to precharge scenario with a single burst.

Many DDR memory systems use prefetching to keep the internal memory clock low while still allowing high transfer rates. The prefetch architecture uses an internal memory bus that is wider than the I/O bus by a factor equal to the prefetch depth. DDR3 and DDR4 use an 8n prefetch architecture, which means the internal memory bus is 8 times wider than the external I/O bus.

The prefetch architecture works by transferring data from the internal core memory into the prefetch buffers for reads, and from the prefetch buffers into the internal memory for writes. Either transfer takes a single internal memory clock cycle, so 4 I/O bus clock cycles after the read command is issued the data is in the prefetch buffers and ready to transfer. Because of this, a CAS latency below 4 is not possible on DDR3.

This means that the DRAM is free to be precharged just 4 clock cycles after the read command is addressed, even though the burst has not occurred yet.
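
The arithmetic behind that "4 clock cycles" can be sketched in a few lines of Python. This is only an illustration: the helper names are made up, and it assumes the usual DDR relationship of two transfers per I/O clock, so an n-bit prefetch spans n / 2 I/O clock cycles. The 666 MHz figure is the memory clock from the first post.

```python
def io_cycles_per_core_cycle(prefetch_n):
    # DDR moves 2 bits per pin per I/O clock, so one internal fetch of
    # prefetch_n bits covers prefetch_n / 2 I/O clock cycles.
    return prefetch_n // 2

def burst_io_cycles(burst_length):
    # A burst of burst_length transfers also uses 2 transfers per I/O clock.
    return burst_length // 2

io_clock_mhz = 666          # I/O bus clock from the original post
prefetch = 8                # 8n prefetch on DDR3 and DDR4

ratio = io_cycles_per_core_cycle(prefetch)
core_clock_mhz = io_clock_mhz / ratio

print("I/O cycles per internal cycle:", ratio)
print("internal core clock (MHz):", core_clock_mhz)
print("I/O cycles for a burst of 8:", burst_io_cycles(8))
```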

Now I'll explain a single read burst to a precharge.

This is a diagram I made of a read to precharge scenario that applies to both DDR4 and DDR3 memory systems. It is a hypothetical situation where CL = 16, tRCD = 16, tRP = 16 and tRTP = 4; these timings are all legal and viable.

In this scenario the memory is first activated, which opens the row that is going to be read. tRCD clock cycles later the read command is issued, which selects the column to read and starts the internal transfer from the memory core to the prefetch buffer. As DDR3 is 8n, the internal memory bus is 8 times wider, and as DDR3 is, well, DDR, the internal memory clock is 4 times slower than the physical I/O bus clock. The data therefore reaches the prefetch buffer just 4 I/O clock cycles after the read command, and from that point on the memory can be precharged whenever.

From the read command you can then see that tRTP is the read to precharge delay, whilst CL is the delay to the start of the burst. Both delays start counting at the same moment and neither depends on the other.

tRP is then the recovery from the precharge to when the memory can be activated or refreshed again.

So the minimum value for tRTP is just 4, not (CL+BL) - tRP. CL and BL don't even need to be accounted for since they go down a different path, and tRP starts when tRTP has expired.
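
To put numbers on the scenario above, here is a rough Python sketch of the timeline in I/O clock cycles, counted from the activate command. It is a simplification under stated assumptions: the function name is made up, the burst is taken to occupy 4 cycles, and other constraints on precharge such as tRAS are deliberately ignored.

```python
def read_to_precharge_timeline(cl, trcd, trp, trtp, burst_cycles=4):
    """Cycle numbers for one activate -> read -> precharge sequence.

    Simplified model: tRAS and every other timing not listed are ignored.
    """
    activate = 0
    read = activate + trcd              # column read issued tRCD after activate
    precharge = read + trtp             # earliest precharge, tRTP after the read
    data_start = read + cl              # burst starts CL after the read
    data_end = data_start + burst_cycles
    next_activate = precharge + trp     # row can be reopened tRP after precharge
    return {
        "read": read,
        "precharge": precharge,
        "data_start": data_start,
        "data_end": data_end,
        "next_activate": next_activate,
    }

# The hypothetical timings from the diagram: CL = 16, tRCD = 16, tRP = 16, tRTP = 4.
for event, cycle in read_to_precharge_timeline(cl=16, trcd=16, trp=16, trtp=4).items():
    print(f"{event:>13}: cycle {cycle}")
```

In this model the precharge is issued at cycle 20, well before the burst starts at cycle 32, which is the point being made: the tRTP path and the CL path run side by side.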

I hope this helps, I'm happy to answer any other questions you might have

nnimrod · July 30, 2021

Thanks for making an account to reply! I'm not sure I understand correctly yet tho.

Why do we have to wait tCAS clocks to start the burst if the memory is already in the buffer 4 clocks after the read command was processed?
And why does this motherboard allow me to set tRTP to 3? Setting it to 3 isn't a meaningless change, it changes performance, and it shows up in the Gigabyte software that shows memory timings in the OS. Is it setting tRTP to 3 only in cases where the burst has been chopped to 4 bits?

Do you happen to work in the field?

TerraRaptor · July 30, 2021

5 hours ago, nnimrod said:

Setting it to 3 isn't a meaningless change, it changes performance

Data in your table displays just a normal variance imho. I think after 100 runs both datasets will average to the same values.
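
A quick way to sanity-check this without 100 runs is a Welch-style t statistic on the two sets of 12 SuperPi times from the first post. This is only a rough sketch: it assumes the runs are independent and roughly normally distributed, which is doubtful here since the slowest runs look like outliers, and `welch_t` is a name made up for this example.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(a, b):
    """Difference of means divided by its estimated standard error."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se

# SuperPi 32m times in seconds, copied from the first post.
trtp3 = [465.828, 466.015, 466.110, 466.140, 466.203, 466.250,
         466.343, 466.344, 466.344, 466.375, 466.453, 467.891]
trtp4 = [465.594, 465.797, 465.890, 466.141, 466.156, 466.219,
         466.265, 466.328, 466.453, 466.484, 466.860, 467.015]

print("mean difference (s):", round(mean(trtp3) - mean(trtp4), 3))
print("t statistic:", round(welch_t(trtp3, trtp4), 2))
```

As a rule of thumb, an absolute t value near or above 2 would hint at a real difference; a value well below that is what you would expect from noise alone.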

nnimrod · July 30, 2021

6 hours ago, TerraRaptor said:

Data in your table displays just a normal variance imho. I think after 100 runs both datasets will average to the same values.

I fear you might be correct. But surely I'm not alone in my reluctance to run 32m 200 times to know for sure if one secondary timing is faster...

Confidence in results would be better if I could reduce variance. And I had much better variance until I tightened to cas 5. Previously at cas 6 I was down to about .2 or less variance between the best half of 12 runs. Cas 5 brought better best runs, and worse worst runs. I hope that I can get the variance down again by going back over seconds/terts.

One of the important terts was skewing tRDRD_dr/_dd to 5/6. tWR 9 was also very important.
