iMac G5 and dnetc benchmark
  • Priest of the Order of the Butterfly
    koszer
    Posts: 986 from 2004/2/8
    From: Poland
    I was running some benchmarks comparing the available G4 and G5 MorphOS machines when I stumbled upon something weird: the dnetc client's "Benchmark all" reports significantly lower values for the iMac G5 than for a 1,67 GHz PowerBook G4. Here are the results [in nodes/sec]:


    OGR-NG:
    PowerMac G5 Quad 2,5 GHz: 41083003
    PowerMac G5 Dual Processor 2,7 GHz: 42974426
    iMac G5 2,1 GHz: 33251779
    PowerBook G4 1,67 GHz: 35665372

    RC5-72:
    PowerMac G5 Quad 2,5 GHz: 18861939
    PowerMac G5 Dual Processor 2,7 GHz: 20093102
    iMac G5 2,1 GHz: 15002921
    PowerBook G4 1,67 GHz: 17535177

    Can anyone confirm this behaviour? What could be the cause?
  • »20.05.20 - 11:27
  • Yokemate of Keyboards
    Andreas_Wolf
    Posts: 10932 from 2003/5/22
    From: Germany
    > I stumbled at something weird: The dnetc client "Benchmark all" reports values significantly
    > lower for iMac G5 than for a G4 PowerBook 1,67 GHz. Here are the results [in nodes/sec]: [...]

    Per-clock performance normalized to "1.00 = slowest" is this:

    OGR-NG | RC5-72
    1.04 | 1.06 : PowerMac G5 Quad
    1.01 | 1.04 : PowerMac G5 Dual Processor
    1.00 | 1.00 : iMac G5
    1.35 | 1.47 : PowerBook G4
    As can be seen, your iMac G5 results scale quite well with your PowerMac G5 results, so if the iMac results are weird, the PowerMac results must be weird too.
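The normalization above can be reproduced in a few lines. This is only a minimal Python sketch, not anything dnetc itself provides: the function name `per_clock_normalized` is made up for illustration, and the nominal clock rates from koszer's post (2.5, 2.7, 2.1 and 1.67 GHz) are assumed to be the effective ones.

```python
# Nominal clock rates in GHz as given in the thread (assumed to be exact).
CLOCK_GHZ = {
    "PowerMac G5 Quad": 2.5,
    "PowerMac G5 Dual Processor": 2.7,
    "iMac G5": 2.1,
    "PowerBook G4": 1.67,
}

# dnetc "Benchmark all" results in nodes/sec, copied from the first post.
OGR_NG = {
    "PowerMac G5 Quad": 41083003,
    "PowerMac G5 Dual Processor": 42974426,
    "iMac G5": 33251779,
    "PowerBook G4": 35665372,
}
RC5_72 = {
    "PowerMac G5 Quad": 18861939,
    "PowerMac G5 Dual Processor": 20093102,
    "iMac G5": 15002921,
    "PowerBook G4": 17535177,
}


def per_clock_normalized(results, clock_ghz):
    """Divide each result by its clock rate, then scale so the slowest is 1.00."""
    per_clock = {name: value / clock_ghz[name] for name, value in results.items()}
    slowest = min(per_clock.values())
    return {name: round(value / slowest, 2) for name, value in per_clock.items()}


ogr = per_clock_normalized(OGR_NG, CLOCK_GHZ)
rc5 = per_clock_normalized(RC5_72, CLOCK_GHZ)
for name in CLOCK_GHZ:
    print(f"{ogr[name]:.2f} | {rc5[name]:.2f} : {name}")
```

Because the input is a throughput (higher is better), dividing by the clock rate is enough; for run times the value would have to be inverted first.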

    > What could be the cause?

    Usually, the per-clock AltiVec performance is significantly lower with the G5 compared to the G4. If you look at the G4 and G5 results of the individual benchmark tests OGR-NG and RC5-72 are composed of, you should be able to see this behaviour with the AltiVec-supporting benchmark tests.

    Edit: Looking at my 2016 dnetc results (G4, G5) for the individual scalar (non-SIMD) RC5-72 tests, it can be seen that the G4 is 15...87% faster per clock than the G5 even with those scalar tests (usually G5 is faster per clock than G4 for dedicated scalar code). It seems that the old hand-crafted PPC assembly code of dnetc was never adapted to the peculiarities of the PPC970 microarchitecture, so that a 60x/7xx microarchitecture is assumed for the scalar tests and a 74xx microarchitecture is assumed for the AltiVec-enabled tests.

    [ Edited by Andreas_Wolf 21.05.2020 - 12:18 ]
  • »20.05.20 - 13:10
  • Priest of the Order of the Butterfly
    koszer
    Posts: 986 from 2004/2/8
    From: Poland
    So basically it's "a dnetc client thing" after all? I ask because in all the other tests I ran, the PowerMac G5 PCIe 2,5 GHz comes out slightly faster (or slower) than the 2,7 GHz Dual Processor (except for the Quake III and MPlayer benchmarks, where it beats the AGP machine hands down), then comes the iMac G5 (albeit performing slightly better than the AGP G5 in 2 of 3 MPlayer tests), then the PowerBook G4 1,67 GHz, and the iBook G4 1,33 GHz always comes last.

    EDIT: As for the AltiVec computations, here are lame_vmx results:

    PMG5 PCIe: 39,804
    PMG5 AGP: 41,864
    iMac G5: 30,392
    PowerBook: 17,977
    iBook: 15,323

    As you can see, the iMac is about 70% faster than the PowerBook despite a clock that is only about 25% faster. This is what I call a predictable result.
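The two percentages can be checked, and combined into a per-clock figure, with a short Python sketch. It assumes that the comma in the lame_vmx figures is a decimal separator, that higher lame_vmx values are better, and that the nominal clocks of 2.1 and 1.67 GHz apply; the variable names are illustrative only.

```python
# lame_vmx results from the list above (comma read as decimal separator).
imac_score, powerbook_score = 30.392, 17.977
# Nominal clock rates in GHz (assumed to be the effective ones).
imac_clock_ghz, powerbook_clock_ghz = 2.1, 1.67

speedup = imac_score / powerbook_score              # raw iMac advantage
clock_ratio = imac_clock_ghz / powerbook_clock_ghz  # clock advantage
per_clock_gain = speedup / clock_ratio              # what remains per clock

print(f"raw speedup:    {100 * (speedup - 1):.0f}%")
print(f"clock speedup:  {100 * (clock_ratio - 1):.0f}%")
print(f"per-clock gain: {100 * (per_clock_gain - 1):.0f}%")
```

Whatever is left after dividing out the clock ratio is the per-clock advantage of the iMac G5 in this particular test.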

    [ Edited by koszer 20.05.2020 - 15:35 ]
  • »20.05.20 - 14:16
  • Yokemate of Keyboards
    Andreas_Wolf
    Posts: 10932 from 2003/5/22
    From: Germany
    > So basically it's "a dnetc client thing" after all?

    I'd say it's dnetc and probably a lot more software that's tuned (via compiler switches or assembly optimization) for any of the 60x/7xx/74xx microarchitectures, which are closely related (603e -> 740/750 -> 7400), whereas the 970 microarchitecture is a different beast (604e -> POWER3 -> POWER4 -> 970).
    On MorphOS we also have this kind of issue with E-UAE, for instance, where a special G5 build has been created because the generic PPC build runs too slowly on the G5. I guess a huge share of MorphOS software is affected by this general issue, but it's mostly only noticeable with performance-critical software like benchmark tests or emulators.

    > As for the AltiVec computations, here are lame_vmx results: [...]

    Per-clock performance normalized to "1.00 = slowest" is this:

    1.48 : PMG5 PCIe
    1.44 : PMG5 AGP
    1.34 : iMac G5
    1.00 : PowerBook
    1.07 : iBook

    Interesting. That's not what I would have expected.
  • »20.05.20 - 17:24
  • Paladin of the Pegasos
    Zylesea
    Posts: 1953 from 2003/6/4
    From my (not too systematic) observation, the iMac G5 is about as fast as the PowerBook. But memory-intensive tasks are way faster: Showgirls (not much bandwidth required) is faster on my PowerBook, but the iMac G5 performs better in the bandwidth-intensive Odyssey and MPlayer, which makes the upgrade worthwhile.
    --
    http://www.via-altera.de

    Whenever you're sad just remember the world is 4.543 billion years old and you somehow managed to exist at the same time as David Bowie.
    ...and Matthias , my friend - RIP
  • »20.05.20 - 19:33
  • Yokemate of Keyboards
    Andreas_Wolf
    Posts: 10932 from 2003/5/22
    From: Germany
    > Showgirls (not much bandwidth required)is faster on my powerbook

    This may very well be a case of ShowCase (the politically correct name of that application since MorphOS 3.10 ;-) being compiled for G4 or earlier. I wonder how this comparison would pan out with a G5-specific executable. Or it may be a case of ShowCase making extensive use of AltiVec, which is usually faster per clock on G4.
  • »20.05.20 - 20:20
  • Moderator
    Kronos
    Posts: 1981 from 2003/2/24
    a) the correct name is of course "ShowKäse"

    b) image transformations are what SIMD was kinda made for
    --------------------- May the 4th be with you ------------------
    Mother Russia dance of the Zar, don't you know how lucky you are
  • »20.05.20 - 21:18
  • Paladin of the Pegasos
    Zylesea
    Posts: 1953 from 2003/6/4
    @ Kronos

    Well, the use case defines the name. I watch more girls than "cases". But not what your nasty mind thinks (cough, cough) - I have two little girls and a wife at home. Guess who's in most of my photos...


    AltiVec rocks with JPEG picture decompression, but
    a) G5 VMX should be compatible with AltiVec, hence the SIMD unit should get used, too.
    b) MPlayer benefits from AltiVec, too. But MPlayer gives way better results on the G5.

    I am not saying G4 AltiVec may not be faster per clock, but the main difference (that's my guess) is the throughput.

    In my eyes, the _major_ benefit of the G5 is not the faster CPU core but the faster bus.
  • »20.05.20 - 22:33
  • Yokemate of Keyboards
    Andreas_Wolf
    Posts: 10932 from 2003/5/22
    From: Germany
    > G5 VMX should be compatible to Altivec hence the SIMD unit should get used, too.

    Yes, absolutely.

    > Mplayer benefits from Altivec, too. But Mplayer gives the way better results on
    > the G5. I am not saying G4 Altivec per clock may not be faster, but the main
    > difference - so my guess - is the throughput.

    Yes, in case of a video player, way higher memory bandwidth likely outweighs SIMD being a little slower. Besides, slower SIMD per clock may still result in faster SIMD if the clock difference is just big enough.
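As a rough illustration of that last point, effective speed can be modelled as per-clock speed times clock rate. The Python sketch below borrows the dnetc per-clock ratios of the PowerBook G4 over the iMac G5 from earlier in the thread (1.35 and 1.47) as a stand-in for a SIMD-heavy workload, which is an assumption, as are the nominal 1.67 GHz clock and the helper name `break_even_clock_ghz`.

```python
def break_even_clock_ghz(per_clock_advantage, reference_clock_ghz):
    """Clock rate at which the per-clock-slower CPU matches the reference CPU."""
    return per_clock_advantage * reference_clock_ghz


G4_CLOCK_GHZ = 1.67  # nominal PowerBook G4 clock (assumed exact)
# Per-clock advantage of the PowerBook G4 over the iMac G5 in the dnetc tests.
G4_ADVANTAGE = {"OGR-NG": 1.35, "RC5-72": 1.47}

for test, advantage in G4_ADVANTAGE.items():
    needed = break_even_clock_ghz(advantage, G4_CLOCK_GHZ)
    print(f"{test}: a G5 with the iMac's per-clock speed needs more than {needed:.2f} GHz")
```

With these figures the 2.1 GHz iMac G5 stays below both thresholds, which is consistent with its raw dnetc results being lower than the PowerBook's.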
  • »20.05.20 - 23:28
  • Priest of the Order of the Butterfly
    koszer
    Posts: 986 from 2004/2/8
    From: Poland
    Quote:

    Zylesea wrote:
    b) Mplayer benefits from Altivec, too. But Mplayer gives the way better results on the G5.


    Here are my MPlayer benchmark results for a .webm file (1792x1080):

    PowerMac G5 "Quad": 443,328 s
    PowerMac G5 "DP" 2,7 GHz: 545,702 s
    iMac G5: 575,818 s
    PowerBook G4 1,67 GHz: 709,675 s
    iBook G4 1,33 GHz: 864,667 s

    What is more interesting, for an mp4 file (1280x544) the results are:

    PowerMac G5 "Quad": 14,023 s
    PowerMac G5 "DP" 2,7 GHz: 24,108 s
    iMac G5: 19,256 s
    PowerBook G4 1,67 GHz: 36,631 s
    iBook G4 1,33 GHz: 43,699 s

    So there is a big advantage for the PCIe models over the AGP-equipped G5.
  • »21.05.20 - 07:23
  • Yokemate of Keyboards
    Andreas_Wolf
    Posts: 10932 from 2003/5/22
    From: Germany
    > Here are my Mplayer benchmark results for .webm file (1792x1080): [...]
    > what is more interesting, for a mp4 file (1280x544) results are: [...]
    > so a big advantage for the PCIe models over the AGP equipped G5.

    Per-clock performance (higher = better) normalized to "1.00 = slowest" is this:

    webm | mp4
    1.33 | 1.86 : PowerMac G5 "Quad"
    1.00 | 1.00 : PowerMac G5 "DP"
    1.22 | 1.61 : iMac G5
    1.25 | 1.07 : PowerBook G4
    1.28 | 1.12 : iBook G4

    (I hope I correctly assume that the video files were read from the RAM disk in any case, so that storage connection speed plays no role.)

    Your observation is certainly correct as in both tests, the slowest is the AGP G5 and the fastest is one of the PCIe G5s, so among the G5s the type of GPU connection seems to be the determining factor for video playing performance.
    Regarding G4 vs. G5, it can be seen that the G4s are found between the two PCIe G5s for high-res WebM video replay, and that the G4s are found between the AGP G5 and the PCIe G5s (but just above the AGP G5) for low-res MP4 video replay. I'd say that there can't be a unidimensional explanation for these G4 vs. G5 per-clock performance results. It's rather that these mixed results stem from factors like SIMD performance, memory bandwidth and microarchitectural code tuning interfering with each other.
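For run times, per-clock performance has to be derived from the reciprocal of the time. The following Python sketch shows one way to arrive at the table above; the function name is hypothetical, and the G4 clocks are assumed to be 1.667 and 1.333 GHz (the exact values behind the nominal 1,67 and 1,33 GHz), an assumption chosen because it reproduces the posted figures to two decimals.

```python
CLOCK_GHZ = {
    'PowerMac G5 "Quad"': 2.5,
    'PowerMac G5 "DP"': 2.7,
    "iMac G5": 2.1,
    "PowerBook G4": 1.667,  # assumed exact clock behind the nominal 1,67 GHz
    "iBook G4": 1.333,      # assumed exact clock behind the nominal 1,33 GHz
}

# MPlayer benchmark run times in seconds, copied from koszer's post.
WEBM_SECONDS = {
    'PowerMac G5 "Quad"': 443.328,
    'PowerMac G5 "DP"': 545.702,
    "iMac G5": 575.818,
    "PowerBook G4": 709.675,
    "iBook G4": 864.667,
}
MP4_SECONDS = {
    'PowerMac G5 "Quad"': 14.023,
    'PowerMac G5 "DP"': 24.108,
    "iMac G5": 19.256,
    "PowerBook G4": 36.631,
    "iBook G4": 43.699,
}


def per_clock_from_times(seconds, clock_ghz):
    """Turn run times into per-clock performance, scaled so the slowest is 1.00."""
    # Performance is the reciprocal of the run time; dividing by the clock
    # rate gives performance per GHz.
    per_clock = {name: 1.0 / (t * clock_ghz[name]) for name, t in seconds.items()}
    slowest = min(per_clock.values())
    return {name: round(value / slowest, 2) for name, value in per_clock.items()}


webm = per_clock_from_times(WEBM_SECONDS, CLOCK_GHZ)
mp4 = per_clock_from_times(MP4_SECONDS, CLOCK_GHZ)
for name in CLOCK_GHZ:
    print(f"{webm[name]:.2f} | {mp4[name]:.2f} : {name}")
```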
  • »21.05.20 - 14:01
  • Order of the Butterfly
    ernsteiswuerfel
    Posts: 360 from 2015/6/18
    From: Funeralopolis
    Thanks for sharing your mplayer benchmark results!

    AFAIK the PCIe models also got DDR2 RAM vs. DDR1 RAM in the AGP models. Would be interesting if a PowerBook G4 5,8 is faster than a 5,6 one. Both clock at 1,67 GHz but the former has DDR2 RAM.
    Talos II. [Gentoo Linux] | PMac G5 11,2. PMac G4 3,6. PBook G4 5,8. [MorphOS 3.13 / Void Linux / Gentoo Linux] | A1200. ACA-1233, Indivision AGA Mk2. [Amiga OS 3.2]
  • »21.05.20 - 14:33
  • Priest of the Order of the Butterfly
    koszer
    Posts: 986 from 2004/2/8
    From: Poland
    My MPlayer benchmarks were made on a "DDR2" PowerBook. I don't have a 5,6 at hand but I know a guy that's got one. I'll try to contact him and we'll see if the memory type has any meaning here.
  • »21.05.20 - 15:05
  • Yokemate of Keyboards
    Andreas_Wolf
    Posts: 10932 from 2003/5/22
    From: Germany
    > AFAIK the PCIe models also got DDR2 RAM vs. DDR1 RAM
    > in the AGP models.

    Correct. It would be interesting to know whether the MPlayer performance advantage of PCIe/DDR2 G5 over AGP/DDR1 G5 is caused by the faster GPU connection like we've assumed in this thread, or by the (theoretical) doubling of the memory data transfers per clock cycle, or a combination of both.

    > Would be interesting if a PowerBook G4 5,8 is faster than a 5,6 one.
    > Both clock at 1,67 GHz but the former has DDR2 RAM.

    The G4 CPU can only transfer data on one edge of the bus clock signal, so it utilizes DDR1/DDR2 memory as if it was SDR memory. The doubling of the theoretical data rate from SDR to DDR1 and again to DDR2 is completely lost on the G4 CPU. The memory controller of G4 Macs with DDR1/DDR2 memory however is able to transfer data on both edges, so controllers connected to the northbridge may be served on both edges, but the CPU itself cannot.
    Also, you can see from koszer's lame_vmx and MPlayer results that the per-clock performance of the iBook with DDR1 is actually higher (by a single-digit percentage) than that of the PowerBook with DDR2. I'm not sure why that is (I'd have expected very much the same per-clock performance), but it seems to support the doubt that DDR2 trumps DDR1 here.
  • »21.05.20 - 17:28
  • Priest of the Order of the Butterfly
    koszer
    Posts: 986 from 2004/2/8
    From: Poland
    For PowerBook G4 5,6 (1,67 GHz and DDR1 model) the Mplayer results are:

    640x276 mov: 14,282 s
    1280x544 mp4: 35,643 s

    for a quick comparison - PowerBook 5,8 (1,67 GHz and DDR2 model) results are:

    640x276 mov: 15,021 s
    1280x544 mp4: 36,631 s

    So it looks like the results are pretty much the same (actually slightly slower for the DDR2 model, but I believe that if the test were repeated a few more times we could get the same result).
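To put a number on "pretty much the same", the relative difference between the two models can be computed as below. This is just a Python sketch over the four run times listed above; the helper name `percent_slower` is made up for illustration, and single runs say nothing about run-to-run variation.

```python
# MPlayer run times in seconds, copied from the lists above.
DDR1_SECONDS = {"640x276 mov": 14.282, "1280x544 mp4": 35.643}  # PowerBook 5,6
DDR2_SECONDS = {"640x276 mov": 15.021, "1280x544 mp4": 36.631}  # PowerBook 5,8


def percent_slower(reference_seconds, other_seconds):
    """How much longer (in percent) the other run took than the reference run."""
    return 100.0 * (other_seconds - reference_seconds) / reference_seconds


differences = {
    clip: percent_slower(DDR1_SECONDS[clip], DDR2_SECONDS[clip])
    for clip in DDR1_SECONDS
}
for clip, diff in differences.items():
    print(f"{clip}: DDR2 model took {diff:.1f}% longer than DDR1 model")
```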
  • »24.05.20 - 12:14
  • Yokemate of Keyboards
    Andreas_Wolf
    Posts: 10932 from 2003/5/22
    From: Germany
    > For PowerBook G4 5,6 (1,67 GHz and DDR1 model) the Mplayer results are: [...]
    > PowerBook 5,8 (1,67 GHz and DDR2 model) results are: [...]
    > So it looks the results are pretty much the same (actually slightly slower for
    > the DDR2 model [...]).

    Thanks. Pretty much as expected for G4 and in line with your previous DDR1 iBook vs. DDR2 PowerBook results scaled by me.
  • »24.05.20 - 16:01