Some doubts about the PegasosII G3
  • Acolyte of the Butterfly
    BigGun
    Posts: 150 from 2004/6/18
    From: Nagold - Germany
    Quote:


    CISC wrote:
    Quote:

    the G4 memory throughput is limited by MOS atm anyway.


    That's not true



    It's a known fact that the memory throughput of the G4 is limited on MOS.
    I'll give you some numbers to explain this:

    The possible memory throughput of the Pegasos G4 (under Linux) is actually not bad. The Pegasos G4 achieves the throughput values below under Linux using "good" code.

    reading: 680 MB/sec
    writing: 450 MB/sec
    copying: 650 MB/sec

    These values are in the range of an Athlon with DDR memory when using normal C code.

    For comparison, a G3 (Pegasos I/II/AmigaOne) is limited to about 200-250 MB/sec.

    For many applications the limiting factor is the memory throughput and not the CPU clock rate. (Well, I think we all knew this :)

    So if a MAME game or video plays back slowly then the major reason for this is the memory throughput.

    The problem under MOS is that the cache settings are set in such a way that the possible memory throughput of the G4 is about halved.

    While under MacOS or Linux you can copy about 600-700 MB/sec,
    under MOS 1.4 you are limited to 300-400 MB/sec



    Quote:


    MorphOS has no influence over that (apart from setting default cache-modes for various areas of memory (each with their own up- and downsides))...
    - CISC


    Yes, as you say, it's the cache settings that are limiting the memory throughput.
    Frankly, I find the throughput limitation a big disadvantage, but I'm excited to learn more. If there are any advantages of the cache settings that outweigh this, then please enlighten us.
    I think we are all curious to learn them.

    Cheers
    Gunnar

    [ Edited by BigGun on 2006/7/13 21:15 ]
  • »13.07.06 - 21:12
  • Acolyte of the Butterfly
    BigGun
    Posts: 150 from 2004/6/18
    From: Nagold - Germany
    I would like to explain in more detail the limits of the G3.
    (For those wondering why mplayer runs so slowly on the G3.)

    If you watch a video stream at 800x600 pixel resolution in true colour (32-bit), then each frame is about 2 MB.
    If you watch this at 25 frames per second,
    then this means reading 25*2 MB and writing 25*2 MB per second.
    Very simple mathematics :) (okay, where is my pocket calculator?)

    Okay, here it is: it's 100 MB/sec.
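
    The estimate above can be checked with a few lines of Python. This is only a sketch: it assumes 1 MB = 1024 * 1024 bytes and that every frame is read once and written once, and the variable names are illustrative.

```python
# Rough check of the frame-bandwidth estimate in the post.
# Assumptions (not from the post): 1 MB = 1024 * 1024 bytes,
# each frame is read once and written once.
WIDTH, HEIGHT = 800, 600
BYTES_PER_PIXEL = 4      # 32-bit true colour
FPS = 25
MB = 1024 * 1024

frame_bytes = WIDTH * HEIGHT * BYTES_PER_PIXEL
frame_mb = frame_bytes / MB

# Reading plus writing doubles the traffic on the bus.
traffic_mb_per_sec = 2 * FPS * frame_mb

print(f"frame size: {frame_mb:.2f} MB")
print(f"read + write traffic: {traffic_mb_per_sec:.1f} MB/sec")
```

    With the exact frame size the total comes out slightly below the rounded 100 MB/sec, because a frame is a bit smaller than 2 MB.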

    Unfortunately the 60x memory bus of the G3 is very limited.
    A typical C program is normally able to copy at most 100 MB/sec on a G3.
    A PPC-ASM optimized program (pulling some cache tricks) manages 200-250 MB/sec.

    As you can see, the 100 MB/sec limit is very low, and that's the reason why video and big-memory apps like Linux X and games run slowly on the G3.

    A G4 has the improved MPX bus protocol.
    A typical C program is normally able to copy about 350-400 MB/sec with a Pegasos G4.
    Using some ASM cache tricks you can increase this to about 650-700 MB/sec.

    (NOTE: These cache tricks DO NOT WORK under MOS 1.4 !! )

    As you can easily see, the G4 memory throughput is much higher than that of the G3. That is one of the main reasons why videos and games run faster on the G4.



    Does clock rate matter?

    Quick answer: normally clock rate matters little.

    We all know that the PPC uses a 64-bit memory bus.
    That means it can read or write 8 bytes at once.
    So if the memory bus had no wait states, a G3/G4 would read 8 MB/sec per MHz.

    In other words, a memory throughput of 100 MB/sec corresponds to about 12 MHz.
    In words: twelve megahertz.

    So if you copy memory with your 600 MHz G3, it would effectively need only 12 MHz to copy the 100 MB per second.
    Or in other words, you waste 588 MHz waiting for the bus.
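
    As a hedged sketch of this back-of-the-envelope calculation, assuming an ideal 64-bit bus with zero wait states and 1 MB = 10**6 bytes (the names are illustrative):

```python
# Idealised bus model from the post: 8 bytes move per clock cycle.
# Assumptions: zero wait states, 1 MB = 10**6 bytes.
BUS_BYTES_PER_CYCLE = 8      # 64-bit memory bus
TARGET_MB_PER_SEC = 100      # observed copy throughput
CPU_MHZ = 600

# MHz of clock needed to sustain the target throughput on an ideal bus.
required_mhz = TARGET_MB_PER_SEC / BUS_BYTES_PER_CYCLE
idle_mhz = CPU_MHZ - required_mhz

print(f"required: {required_mhz} MHz, idle: {idle_mhz} MHz")
```

    Under these assumptions the exact figures are 12.5 MHz and 587.5 MHz, which the post rounds to 12 and 588.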

    BTW a good 68k CPU (Cyberstorm) achieves throughput in the range of 150-180 MB/sec. So it should be easy to understand that in rare cases and with "simple" C programs you can still outperform a G3 with a 68060.

    As mentioned, the G3 is mainly limited by its bus.
    The G4 has a much better bus protocol achieving a higher throughput.
    Even much higher, if you use PPC-optimized code and Linux (not MOS 1.4!)

    BTW the AmigaOne always operates in the slow 60x bus mode.
    The reason is the Articia.
    So on the AmigaOne the G4 will be limited to the same 200 MB/sec range as the G3.

    So if you have the choice between several Pegasos models and you want to run memory-intensive applications like Linux, then I would highly recommend you consider a Pegasos G4.

    Cheers
    Gunnar

    [ Edited by BigGun on 2006/7/13 22:17 ]
  • »13.07.06 - 22:13
  • MorphOS Developer
    bigfoot
    Posts: 508 from 2003/4/11
    Hi,

    Quote:

    So if you copy memory with your 600 MHz G3, it would effectively need only 12 MHz to copy the 100 MB per second.
    Or in other words, you waste 588 MHz waiting for the bus.


    Your calculations are slightly off, even with your hypothetical case of zero waitstate, zero latency memory, cache and bus.

    You need one load and one store instruction to copy a chunk of memory. The largest chunk of memory you can load or store in one instruction on a G3 is a double, which is 8 bytes. The 750CX (and probably all other PPC CPUs) has just one load/store unit, which means that you'll need 2 cycles per 8 bytes of memory to copy. So for finding out how many cycles it would take to copy 100MB of memory, the equation looks something like this: 100*1024*1024/(8/2), or in other words, you'd need 26214400 cycles to do this copy.
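
    The cycle count above can be reproduced as follows. This is a minimal sketch of the same idealised model (zero wait states, one load/store unit, 8-byte accesses); the variable names are illustrative.

```python
# Cycle count for copying 100 MB with one load/store unit.
# Model from the post: each 8-byte chunk needs one load and one store,
# and only one of them can issue per cycle.
COPY_BYTES = 100 * 1024 * 1024
BYTES_PER_ACCESS = 8     # largest single load/store on a G3 (a double)
CYCLES_PER_CHUNK = 2     # one load cycle + one store cycle

cycles = COPY_BYTES // BYTES_PER_ACCESS * CYCLES_PER_CHUNK
mhz_needed = cycles / 1_000_000   # clock rate needed to do this once per second

print(cycles, mhz_needed)
```

    So even in the ideal case the copy needs roughly 26 MHz worth of cycles per second rather than 12.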
    I rarely log in to MorphZone which means that I often miss private messages sent on here. If you wish to contact me, please email me at [username]@asgaard.morphos-team.net, where [username] is my username here on MorphZone.
  • »14.07.06 - 02:04
  • MorphOS Developer
    bigfoot
    Posts: 508 from 2003/4/11
    BTW...

    Quote:

    If you watch a video stream at 800x600 pixel resolution


    Normal video resolution is 720x576 for PAL and 720x480 for NTSC

    Quote:

    in true colour (32-bit), then each frame is about 2 MB.


    Unless you're overlay-impaired, frames are stored in YUV on the gfxcard, which is 16bpp. In any case, the source is gonna be 16bpp, so you end up with 16bpp read and 32bpp write in worst case, and 16bpp read and 16bpp write in most cases. So for a PAL DVD (since you're talking about 25FPS), a frame would be 720x576x2 bytes, or 810 kB.

    Quote:

    If you watch this at 25 frames per second,
    then this means reading 25*2 MB and writing 25*2 MB per second.
    Very simple mathematics :) (okay, where is my pocket calculator?)

    Okay, here it is: it's 100 MB/sec.



    With updated numbers, that's 39MB/s, with half of that going to the PCI bus.
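
    For reference, a small sketch of how the updated figure falls out, assuming 1 kB = 1024 bytes, 1 MB = 1024 * 1024 bytes, and one 16bpp read plus one 16bpp write per frame (names are illustrative):

```python
# PAL DVD frame traffic with 16bpp packed YUV.
# Assumptions: 1 kB = 1024 bytes, 1 MB = 1024 * 1024 bytes,
# each frame is read once and written once at 16bpp.
WIDTH, HEIGHT, FPS = 720, 576, 25
BYTES_PER_PIXEL = 2

frame_bytes = WIDTH * HEIGHT * BYTES_PER_PIXEL
frame_kb = frame_bytes / 1024

# Read plus write; the write half goes over the PCI bus to the gfxcard.
traffic_mb_per_sec = 2 * FPS * frame_bytes / (1024 * 1024)

print(f"{frame_kb:.0f} kB per frame, {traffic_mb_per_sec:.1f} MB/sec total")
```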
  • »14.07.06 - 04:48
  • MorphOS Developer
    CISC
    Posts: 619 from 2005/8/27
    From: the land with ...
    Quote:

    It's a known fact that the memory throughput of the G4 is limited on MOS.


    It depends what you define as limited.

    Quote:

    The possible memory throughput of the Pegasos G4 (under Linux) is actually not bad. The Pegasos G4 achieves the throughput values below under Linux using "good" code.

    reading: 680 MB/sec
    writing: 450 MB/sec
    copying: 650 MB/sec


    Your benchmark is obviously broken, how can you achieve 650MB/s copy in a real world test when you can only achieve 450MB/s write?

    Quote:

    The problem under MOS is that the cache settings are set in such a way that the possible memory throughput of the G4 is about halved.


    Only in your highly specialized test.

    Quote:

    Yes, as you say, it's the cache settings that are limiting the memory throughput.
    Frankly, I find the throughput limitation a big disadvantage, but I'm excited to learn more. If there are any advantages of the cache settings that outweigh this, then please enlighten us.


    IIRC the cache-mode used by Linux seriously impairs 32bit access (try benchmarking (preferably with a properly working benchmark) that on both), and this is what is most commonly used in real-world programs, so the choice between gaining a little speed in corner cases and having decent speed in general was easy.

    The effect of the cache-mode we chose for MorphOS AFAICR is that 64bit access is not that much of a gain over 32bit access on cached areas (but rather on uncached ones like gfxmem), but 32bit access improves so as to meet in the middle with 64bit access at a decent overall speed (which is much more useful in the real world).


    - CISC
  • »14.07.06 - 04:51
  • MorphOS Developer
    CISC
    Posts: 619 from 2005/8/27
    From: the land with ...
    Quote:

    Unless you're overlay-impaired, frames are stored in YUV on the gfxcard, which is 16bpp. In any case, the source is gonna be 16bpp, so you end up with 16bpp read and 32bpp write in worst case, and 16bpp read and 16bpp write in most cases.


    Actually, since you're talking about DVDs they are actually normally stored in 4:2:0, which is 12bpp. ;)

    IE, if the player is using the planar YUV output of the decoder and copying that to the gfxcard directly without conversion (current MorphOS MPlayers do convert to 16bpp packed YUV though) you are talking about even less data transferred. ;)
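
    To put a number on "even less data", here is a sketch comparing a planar 4:2:0 frame with a 16bpp packed frame at PAL DVD resolution. It assumes the usual 4:2:0 layout (a full-resolution Y plane plus U and V planes at half resolution in both directions); the names are illustrative.

```python
# Frame size for planar YUV 4:2:0 (12bpp) versus packed 16bpp YUV.
# Assumption: Y plane at full resolution, U and V planes subsampled
# by 2 horizontally and vertically.
WIDTH, HEIGHT = 720, 576

y_plane = WIDTH * HEIGHT
chroma_plane = (WIDTH // 2) * (HEIGHT // 2)

frame_420_bytes = y_plane + 2 * chroma_plane   # 12 bits per pixel on average
frame_packed_bytes = WIDTH * HEIGHT * 2        # 16 bits per pixel

ratio = frame_420_bytes / frame_packed_bytes

print(frame_420_bytes, frame_packed_bytes, ratio)
```

    Under these assumptions the planar frame is three quarters the size of the packed one.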


    - CISC
  • »14.07.06 - 05:00
  • Acolyte of the Butterfly
    BigGun
    Posts: 150 from 2004/6/18
    From: Nagold - Germany
    Quote:


    Quote:

    The possible memory throughput of the Pegasos G4 (under Linux) is actually not bad. The Pegasos G4 achieves the throughput values below under Linux using "good" code.

    reading: 680 MB/sec
    writing: 450 MB/sec
    copying: 650 MB/sec


    Your benchmark is obviously broken, how can you achieve 650MB/s copy in a real world test when you can only achieve 450MB/s write?




    Please note that the standard STREAM benchmark
    defines memory throughput as the amount of data that goes over the bus. I'm using the same definition as STREAM in my examples.

    In the case of a copy, the throughput is reading + writing.
    (BTW this should have been obvious if you had looked at the calculation that I gave.)
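
    The difference between the two ways of counting can be sketched like this. The function names and the 325 MB payload are hypothetical, chosen only to match the 650 MB/sec copy figure; 1 MB is taken as 10**6 bytes.

```python
# Two ways to report the speed of the same memory copy.
# Assumption: 1 MB = 10**6 bytes; names and payload size are illustrative.

def bus_traffic_rate(bytes_copied: int, seconds: float) -> float:
    """STREAM-style figure: bytes read plus bytes written, in MB/sec."""
    return 2 * bytes_copied / seconds / 1e6

def payload_rate(bytes_copied: int, seconds: float) -> float:
    """Figure that counts only the bytes delivered to the destination."""
    return bytes_copied / seconds / 1e6

# Hypothetical run: 325 MB of payload copied in one second.
payload = 325_000_000
traffic = bus_traffic_rate(payload, 1.0)
delivered = payload_rate(payload, 1.0)

print(traffic, delivered)
```

    Counted this way, a 650 MB/sec copy figure means about 325 MB/sec is actually being written, which sits below the 450 MB/sec write figure.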

    Quote:


    Quote:

    The problem under MOS is that the cache settings are set in such a way that the possible memory throughput of the G4 is about halved.


    Only in your highly specialized test.



    READING or COPYING memory is not a specialized test!
    Most serious programs need to do this and are limited by it!

    The fact is that the possible memory throughput under MOS is simply halved. This seriously limits programs that process memory or copy to the GFX card (like MAME).

    Quote:


    Quote:

    Yes, as you say, it's the cache settings that are limiting the memory throughput.
    Frankly, I find the throughput limitation a big disadvantage, but I'm excited to learn more. If there are any advantages of the cache settings that outweigh this, then please enlighten us.


    IIRC the cache-mode used by Linux seriously impairs 32bit access (try benchmarking



    The throughput when READING memory or COPYING memory under MOS is limited.
    This is true for all cases 8BIT, 16BIT, 32BIT and 64BIT.


    The 32bit advantage is not true!
    But maybe you mean something different than copying memory?
    Can you please explain or give a real world example? (C or ASM code)


    Quote:


    The effect of the cache-mode we chose for MorphOS AFAICR is that 64bit access is not that much of a gain over 32bit access on cached areas (but rather on uncached ones like gfxmem), but 32bit access improves so as to meet in the middle with 64bit access at a decent overall speed (which is much more useful in the real world).



    Fact is that the possible memory throughput under Linux or MacOS is twice the throughput under MOS.
    This is true for 32bit and 64bit access!


    It's clear that this limitation hurts performance.
    Currently I fail to see a case where the MOS settings are of real advantage.
    But maybe you can be so kind as to give a real-world example
    to show where the settings of MOS are of advantage? (some source please)

    In normal real-world usage scenarios
    such as processing and reading main memory
    or copying chunks of memory (in main memory or to the GFX card),
    Linux and Mac clearly outperform MOS.

    I'm looking forward to seeing your example. :)

    Cheers
    Gunnar
  • »14.07.06 - 09:36
  • MorphOS Developer
    CISC
    Posts: 619 from 2005/8/27
    From: the land with ...
    Quote:

    In the case of a copy, the throughput is reading + writing.
    (BTW this should have been obvious if you had looked at the calculation that I gave.)


    You mean the completely wrong calculations (as pointed out by bigfoot (which you completely ignored))?

    Quote:

    The 32bit advantage is not true!
    But maybe you mean something different than copying memory?


    It is, and I mean writing...

    Quote:

    Fact is that the possible memory throughput under Linux or MacOS is twice the throughput under MOS.
    This is true for 32bit and 64bit access!


    Please learn some simple math before you talk about facts.

    Quote:

    It's clear that this limitation hurts performance.


    You mean like with f.ex. MAME where it's faster in MorphOS than Linux?

    Quote:

    But maybe you can be so kind as to give a real-world example to show where the settings of MOS are of advantage? (some source please)


    No need for sources, just benchmark 32bit write like I told you to in the first place.

    Quote:

    copying chunks of memory (in main memory or to the GFX card), Linux and Mac clearly outperform MOS.


    Eh, wtf does mac have to do with anything? Besides, you are completely wrong about gfxmem as that's obviously not cached, thus unaffected by any general settings...

    Quote:

    I'm looking forward to seeing your example. :)


    I doubt it as you don't seem to really be interested in any real results, you're just trying to push through your opinion (dare I say; "as usual"?), completely ignoring facts like f.ex. those laid out by bigfoot...


    - CISC
  • »14.07.06 - 15:22
  • Acolyte of the Butterfly
    BigGun
    Posts: 150 from 2004/6/18
    From: Nagold - Germany
    CISC, PLEASE BEHAVE AND STAY ON TOPIC!

    This thread was about the performance differences of G3 and G4.

    All knowledgeable persons here pointed out that the
    G4 has a much higher performance than the G3.

    My humble addition to this point was that the G4 offers a much increased memory throughput - which is one of the main reasons why LINUX runs so much faster on the G4.

    I made the comment that MOS fails to use the full potential of the G4 memory throughput.
    THIS IS SIMPLY A FACT !

    YOU CAN NOT DENY THIS and there is no NEED TO GET PERSONAL HERE

    If there are any good reasons why this is the case, then please explain them in a good manner!
    We are all curious to learn them, but please be so kind as to behave and stay on topic.

    I only posted facts and gave examples; there is no need for you to get personal here!

    THE FACTS ARE
    Under Linux a Pegasos G4 can read at about 600-700 MB/sec
    Under Linux a Pegasos G4 has a copy throughput of 600-700 MB/sec

    Could you please acknowledge what the limits under MOS are?



    [ Edited by BigGun on 2006/7/14 16:05 ]
  • »14.07.06 - 16:00
  • Yokemate of Keyboards
    magnetic
    Posts: 2129 from 2003/3/1
    From: Los Angeles
    Biggun

    I think you need to relax a little bit, man. The main problem is you are representing the G3 as slow, when it is totally sufficient to run MorphOS, and Linux for that matter. A Peg2 G3 with MorphOS is very nice and fast.

    magnetic
    Pegasos 2 Rev 2B3 w/ Freescale 7447 "G4" @ 1ghz / 1gb Nanya Ram
    Quad Boot: MorphOS 2.7 | Amiga OS4.1 U4 | Ubuntu PPC GNU/Linux | OS X 10.4
  • »14.07.06 - 17:25
  • Order of the Butterfly
    Bladerunner
    Posts: 418 from 2004/2/19
    Biggun:
    Oh no, not that again.
    Because I have no knowledge about all this mem thingy, I won't comment on that.
    However, I find your behaviour very rude (where have I seen that before?), and please (as I told Ralf Megabyte already):
    If *you* want to know something, then use the term *I* and not *we*. "We" don't want to know
    (well, at least *I* don't want to know, simply because I don't understand it anyway, but otoh
    I have other real-world experiences (here MorphOS clearly wins performance-wise)).
    *You* want to. So don't behave as a MorphOS user spokesman!

    And last but not least, arguing that CISC is OT is quite amusing, as he simply gave answers to you.

    To make it short: learn to behave and get some manners.
  • »14.07.06 - 17:44
  • Order of the Butterfly
    tarbos
    Posts: 221 from 2003/4/20
    @Bigfoot

    >The 750CX (and probably all other PPC CPUs) has just one load/store unit

    G5 has two!
    Maybe you should start programming for G5 soon? :)

    @CISC:

    >>I'm looking forward to seeing your example. :)

    >I doubt it as you don't seem to really be interested in any real results

    I would also like to learn about the cache mode choices you took for MOS and how they translate into better performance, please.
  • »14.07.06 - 17:57
  • MorphOS Developer
    CISC
    Posts: 619 from 2005/8/27
    From: the land with ...
    Quote:

    CISC, PLEASE BEHAVE AND STAY ON TOPIC!


    Uhm, I'm merely replying to your questions, if anything you're the one that's off topic (and not behaving by shouting like mad)...

    Quote:

    All knowledgeable persons here pointed out that the G4 has a much higher performance than the G3.


    I can't see anyone disputing that, unknowledgeable or not.

    Quote:

    I made the comment that MOS fails to use the full potential of the G4 memory throughput.
    THIS IS SIMPLY A FACT !


    No it's not, and please stop shouting .. MorphOS simply behaves in a different way than Linux due to cache-modes, this is not necessarily a limitation when you take into consideration the overall performance of the system.

    Quote:

    YOU CAN NOT DENY THIS and there is no NEED TO GET PERSONAL HERE


    I'm not so much denying as telling you about the differences, what and why they are, something you don't seem to be willing to follow up (oh, and did I mention you could stop shouting now?) .. the only one getting personal here seems to be you (who so aptly keeps on ignoring bigfoot).

    Quote:

    If there are any good reasons why this is the case, then please explain them in a good manner!


    I believe I already have.

    Quote:

    We are all curious to learn them, but please be so kind as to behave and stay on topic.


    Make up your mind, do you want me to answer your questions, or do you want me to stay on topic?

    Quote:

    I only posted facts and gave examples; there is no need for you to get personal here!


    Again, your so-called facts have been disproven by bigfoot (whom you ignored), merely pointing out that you are wrong is not personal.

    Quote:

    THE FACTS ARE
    Under Linux a Pegasos G4 can read at about 600-700 MB/sec
    Under Linux a Pegasos G4 has a copy throughput of 600-700 MB/sec


    And the relevance would be (btw, shouting doesn't necessarily make something more factual or relevant)?

    Yes, yes, we know, in the particular cache-mode that Linux is using you can achieve these numbers in highly specialized copyloops, but this tells you absolutely nothing of real-world figures, apart from the fact that this is the kind of performance you can expect to get when doing this kind of copy, however programs in general are not copying massive amounts of data all the time...

    Quote:

    Could you please acknowledge what the limits under MOS are?


    "The sky is the limit"?

    Really though, didn't I explain quite thoroughly the differences already .. did you not read what I wrote or what is your major malfunction?

    Quote:

    @tarbos
    I would also like to learn about the cache mode choices you took for MOS and how they translate into better performance, please.


    Thought I did already, what more do you want to know?


    - CISC

    [ Edited by CISC on 2006/7/14 18:22 ]
  • »14.07.06 - 18:19
  • Just looking around
    Posts: 10 from 2003/2/24
    Hi diehardware,

    I'd get a PegII-G4 in your case. Life is too short
    for G3-600 nowadays. Peg-G3-600 was more or less ok
    for MOS & Linux, when it came out but times (and
    software) change.
    Bye!
  • »14.07.06 - 18:20
  • Order of the Butterfly
    tarbos
    Posts: 221 from 2003/4/20
    >Thought I did already, what more do you want to know?

    One or two comprehensible realworld examples would be nice where MOS' way of doing things is superior.

    The Pegasos seems to have an unusually high RAM latency and I think some tricks like prefetching or longer bursts (where possible) are dearly needed to realise its performance potential.
    Please correct me if this is wrong.

    [ Edited by tarbos on 2006/7/14 18:59 ]
  • »14.07.06 - 18:56
  • Priest of the Order of the Butterfly
    analogkid
    Posts: 659 from 2004/11/3
    From: near myself
    @BigGun:
    Quote:


    The G3 has a memory throughput similar to a K6 or 68040.
    Linux will run on the G3 at comparable speed.



    Do you know what you are talking about?? Linux on a G3 is definitely faster than on a 68040... much faster...
    And about your discussion with CISC: I think he should know what he is talking about, and in my opinion you don't know very much about MorphOS, because if you knew more, you would know that it is MUCH faster than Linux, no matter whether it runs on a G3 or a G4...
  • »14.07.06 - 19:03
  • MorphOS Developer
    CISC
    Posts: 619 from 2005/8/27
    From: the land with ...
    Quote:

    One or two comprehensible realworld examples would be nice where MOS' way of doing things is superior.


    No-one said anything about superior, it's all about compromise .. the cache-mode MorphOS uses gives better 32bit write, but poorer 64bit write (on cached areas only naturally, thus does not apply to gfxmem (here you'll typically see a twofold increase in throughput when using 64bit writes) like BigGun seemed to imply) .. the reason we picked this is because 32bit writes occur all the time, while 64bit writes usually only occur when you are doing optimized copy.

    Quote:

    The Pegasos seems to have an unusually high RAM latency and I think some tricks like prefetching or longer bursts (where possible) are dearly needed to realise the possible performance.
    Please correct me if this is wrong.


    No, you are right.


    - CISC
  • »14.07.06 - 19:05
  • Acolyte of the Butterfly
    BigGun
    Posts: 150 from 2004/6/18
    From: Nagold - Germany
    CISC, please be so kind and stop misquoting me all the time.

    I never implied 64bit writes only.

    What I said is :
    - for many programs the memory throughput is very important.
    - A G4 has a higher memory throughput than a G3
    - under MOS you can not fully utilize the throughput of a G4
    - The throughput difference Linux/MOS is true for 8, 16, 32 and 64 bit!

    If you process a memory array in reading mode
    (summing up an array, searching for a string, comparing memory arrays or similar), then you can achieve up to 700 MB/sec under Linux, while on MOS you hardly get over 200 MB/sec.

    When you copy chunks of memory or apply some operation to a chunk of memory (as a game, video player, or paint program often does on big chunks of memory), then you can achieve up to 700 MB/sec on Linux but only half of that on MOS.

    Analogkid, I never said Linux in total is faster than MOS.
    Linux is way too complex and has way too much overhead compared to the small MOS. What I said is that Linux is heavy and that it makes a big difference to Linux to use a CPU with increased memory throughput.

    CISC, it would be nice if you would read more carefully before you post.
    What I said is that the max throughput under MOS is less than under Linux. You claimed this is not true, but it simply is true!

    CISC, please be so kind as to simply state the facts as they are and stop claiming that
    - copying memory
    - copying images to the gfxcard
    - comparing memory or comparing strings
    - searching in an array
    - summing up an array
    - applying a filter to an image
    are rare cases, which are hardly ever used or needed in a program.

    If there is a good reason for the MOS settings, fine.
    It would be nice if you could give examples so that everybody understands the settings, but don't twist my words and don't misquote me just to defend the MOS settings.

    Gee, I say Linux and MacOS are better at something than MOS
    (memory throughput in this case) and you react as if it were a sin to say this.


    One question:
    Is MOS 1.5 using the same cache settings as MOS 1.4?

    Gunnar


    [ Edited by BigGun on 2006/7/14 20:24 ]
  • »14.07.06 - 20:05
  • MorphOS Developer
    CISC
    Posts: 619 from 2005/8/27
    From: the land with ...
    Quote:

    CISC, please be so kind and stop misquoting me all the time.


    Where am I misquoting you (all quotes are pure cut'n'paste)?

    Quote:

    I never implied 64bit writes only.


    I never implied you implied anything.

    Quote:

    - for many programs the memory throughput is very important.


    But you fail to understand the difference between throughput and cache efficiency.

    Quote:

    - A G4 has a higher memory throughput than a G3


    Like I already said, no-one is disputing that.

    Quote:

    - under MOS you can not fully utilize the throughput of a G4


    Not true (only partially when you are talking about 64bit access to cached areas).

    Quote:

    - The throughput difference Linux/MOS is true for 8, 16, 32 and 64 bit!


    Wrong, and I already told you where to look, yet you're clearly not interested since you are still ignoring it after repeated mention.

    Quote:

    When you copy chunks of memory or apply some operation to a chunk of memory (as a game, video player, or paint program often does on big chunks of memory), then you can achieve up to 700 MB/sec on Linux but only half of that on MOS.


    Again your math is failing you, besides video players like MPlayer don't copy big chunks of memory, they operate on slices, so as to more likely be in cache by the time it's copied to gfxmem.

    Quote:

    CISC, it would be nice if you would read more carefully before you post.
    What I said is that the max throughput under MOS is less than under Linux. You claimed this is not true, but it simply is true!


    It seems it's not my reading skills that are failing...

    Quote:

    CISC, please be so kind as to simply state the facts as they are and stop claiming that
    - copying memory
    - copying images to the gfxcard
    - comparing memory or comparing strings
    - searching in an array
    - summing up an array
    - applying a filter to an image
    are rare cases, which are hardly ever used or needed in a program.


    First off, only the first item applies to your benchmark, the second is identical to MorphOS provided the image data is in cache by the time it's copied (which it is most of the time), and all the rest are totally irrelevant. Secondly I never claimed they were rare cases, I simply said 64bit copyloops were relatively insignificant (and highly localized) compared to the amount of 32bit writes in the life of an OS.

    Quote:

    It would be nice if you could give examples so that everybody understands the settings, but don't twist my words and don't misquote me just to defend the MOS settings.


    I did give examples (you ignored them .. repeatedly), I didn't misquote you and I didn't defend the MorphOS settings. Next?

    Quote:

    Gee, I say Linux and MacOS are better at something than MOS
    (memory throughput in this case) and you react as if it were a sin to say this.


    Gee, I was merely wondering wtf Mac has to do with MorphOS (hw vs sw), Macs have a far better bus than Pegs anyway, and I doubt you have any clue what cache-modes MacOS use...

    Quote:

    Is MOS 1.5 using the same cache settings as MOS 1.4?


    Yes.


    - CISC
  • »14.07.06 - 20:56
  • Order of the Butterfly
    merko
    Posts: 328 from 2003/5/19
    BigGun: You keep asking for sourcecode.. but, *you* are the one making
    big claims here in this thread. So I think *you* are the one who
    should provide some source code showing how non-64 bit operations
    would be faster in Linux than on MOS on the same hardware.
  • »14.07.06 - 22:06
  • Acolyte of the Butterfly
    BigGun
    Posts: 150 from 2004/6/18
    From: Nagold - Germany
    Quote:


    merko wrote:
    BigGun: You keep asking for sourcecode.. but, *you* are the one making
    big claims here in this thread. So I think *you* are the one who
    should provide some source code showing how non-64 bit operations
    would be faster in Linux than on MOS on the same hardware.



    Hi Merko,

    Proving the memory bandwidth differences between MacOS, Linux, and MOS is easy:
    See here: http://www.greyhound-data.com/gunnar/glibc/

    The bandwidth tests were done for the PPC glibc improvements that I took part in with IBM. Test results, executables for MOS, MacOS and Linux, and the source are available on the site too.

    Cheers
  • »15.07.06 - 09:53
  • Acolyte of the Butterfly
    BigGun
    Posts: 150 from 2004/6/18
    From: Nagold - Germany
    Bigfoot,

    before we talk about how much memory bandwidth is needed for playing back a movie (decoding, processing, adding filters and copying to the GFX card),

    I ask you one simple question:

    Many Pegasos G3 users complained that they can NOT play back movies/DVDs fully smoothly.
    Is the main reason for this the lacking CPU clock rate,
    or is this rather a memory bandwidth issue?
  • »15.07.06 - 09:57
  • MorphOS Developer
    CISC
    Posts: 619 from 2005/8/27
    From: the land with ...
    Quote:

    Proving the memory bandwidth differences between MacOS, Linux, and MOS is easy:


    Those benchmarks mainly show cache efficiency (for the particular code at hand) btw, not "memory throughput" like your page claims.

    Quote:

    Many Pegasos G3 users complained that they can NOT play back movies/DVDs fully smoothly.
    Is the main reason for this the lacking CPU clock rate,
    or is this rather a memory bandwidth issue?


    Neither, and I've discussed this several times here on MZ already (try the search feature), basically this is due to (MorphOS port specific) suckage of MPlayer, combined with DVD-ROM suckage (no internal cache) .. the MorphOS port doesn't have cache2 enabled (and since it uses asyncio for files it generally doesn't need it), which pretty much means that DVDs are loaded&decoded sector-by-sector .. this introduces huge latency in data flow, thus you get stuttering (this is not G3 specific either, it happens on G4 too (though less) even though it's more than fast enough) .. try decrypting the DVD to HD and play it from there, I think you'll find that the G3 copes just fine...


    - CISC
  • »15.07.06 - 10:37
  • Priest of the Order of the Butterfly
    Velcro_SP
    Posts: 929 from 2003/7/13
    From: Universe
    The issue with MorphOS mPlayer versions that CISC identifies (cache 2 not enabled leading to sector by sector DVD loading and decoding leading to data flow latency causing DVD stuttering) sounds like something that could be programmed away with some work by the right person. How much work is it, is a bounty the way to go, and how high does the bounty need to be?

    I would also like the ability to link mPlayer as the default tool for an icon for a movie file, like the Amiga olden days, using iconx or xicon or whatever it is.

    PS: BigGun, it's true about the stuttering w. DVDs on G3, however DIVXs and XVIDs even of medium-large dimensions play okay, IIRC.
    Pegasos2 G3, 512 megs RAM
  • »15.07.06 - 12:01
  • Order of the Butterfly
    merko
    Posts: 328 from 2003/5/19
    BigGun: Ok, I downloaded the MOS exe, ran it, and compared the results
    against the same results shown in stream_ODW_G4_1000.txt.

    Funnily, the results seem to be almost exactly the same (sometimes a
    tiny bit faster, sometimes a tiny bit slower). So I don't understand
    how you think this would prove that MOS would be crippling memory
    bandwidth somehow.

    And in any case you seem to be missing the point. Even if it would
    somehow be possible to increase memcopy operations by changing the
    cache settings, you can't just magically switch back to some other
    cache setting for the rest of your code. And if your app spends most
    of its time copying memory around, I'd say you probably have done a
    bad job designing the code (or maybe you're using some newbie OO
    code). In the end, it's no improvement if the 2% spent in memcpy in
    a normal app is 100% faster, if this means the other 98% are slowed
    down by 10% (all numbers drawn out of a hat, if anyone has real ones,
    feel free to post).
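
    Using the same hat-drawn numbers, the trade-off can be worked through in a short sketch. It assumes "100% faster" means the memcpy part takes half the time and "slowed down by 10%" means the rest takes 1.10 times as long; the function name is illustrative.

```python
# Overall runtime when one part of a program speeds up and the rest slows down.
# Assumptions: "100% faster" = half the time, "slowed by 10%" = 1.10x the time.

def relative_runtime(memcpy_share: float,
                     memcpy_speedup: float,
                     other_slowdown: float) -> float:
    """Return the new runtime relative to a baseline of 1.0."""
    other_share = 1.0 - memcpy_share
    return memcpy_share / memcpy_speedup + other_share * other_slowdown

# 2% of the time in memcpy, made twice as fast; other 98% made 10% slower.
new_runtime = relative_runtime(0.02, 2.0, 1.10)

print(f"new runtime: {new_runtime:.3f} x baseline")
```

    With these made-up inputs the program as a whole ends up slower than before, which is the point of the argument; real measurements would be needed to say what happens in practice.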
  • »15.07.06 - 12:58