X1000
  • Yokemate of Keyboards
    Andreas_Wolf
    Posts: 12010 from 2003/5/22
    From: Germany
    > Couldn't you just answer the question?

    That would mean simply quoting what Neko wrote anyway. And if you hadn't ignored my answer from 5 months ago it would have been one step less now ;-)

    > I think you mean running code in big-endian mode.

    Yes, that's what it's all about when we're talking about running m68k code transparently, which is what Jim and I did here.

    > The only time I think code ordering would make a difference is when you
    > want to emulate something (e.g. 68K or PPC)

    Err, that's exactly what this discussion has been about, i.e. running existing m68k code transparently on MorphOS/ARM the same way existing m68k code is transparently running on MorphOS/PPC right now.

    > the emulator would handle that transparently

    I doubt that's possible within a single address space OS that is based on a message passing concept, like AmigaOS or MorphOS' ABox. There could be a way around this, though:

    http://morph.zone/modules/newbb_plus/viewtopic.php?topic_id=7771&forum=3&start=15

    But that's an idea laire doesn't seem to like too much:

    "Either you run everything as big endian with some x86 68k emulation and allow "native" code with some butt ugly compiler byte reordering thowing performance away. Some may argue that this might still be faster than existing ppc systems and this might be even true but there are things which are so ugly you just don't do them."
    http://moobunny.dreamhosters.com/cgi/mbmessage.pl/amiga/126048.shtml
  • »29.06.11 - 22:17
  • Order of the Butterfly
    minator
    Posts: 365 from 2003/3/28
    OK, I did a bit of reading and found a few posts that talk about the processor being able to read big-endian data but other parts of the system will be little endian.

    These are usually promptly followed by "but this doesn't matter to MorphOS".

    If you're moving to a different system you're going to need new drivers for all the hardware anyway so this makes no difference.


    Being a Mac user I did the true big-endian to true little-endian switch some years ago. It went without a hitch, to the end user the switch was completely invisible and I can still run PPC software to this day.
  • »29.06.11 - 22:21
  • Yokemate of Keyboards
    Andreas_Wolf
    Posts: 12010 from 2003/5/22
    From: Germany
    > These are usually promptly followed by "but this doesn't matter to MorphOS".

    Whether transparent m68k backwards compatibility matters is a matter of individual perspective I'd say ;-) If MorphOS was going to abandon it anyway it could as well be ported to x86 instead of ARM.

    > If you're moving to a different system you're going to need new
    > drivers for all the hardware anyway so this makes no difference.

    Huh? What do drivers have to do with transparent backwards compatibility with m68k applications and libraries?

    > Being a Mac user I did the true big-endian to true little-endian switch some
    > years ago. It went without a hitch, to the end user the switch was completely
    > invisible and I can still run PPC software to this day.

    That's because, contrary to AmigaOS or MorphOS' ABox, Mac OS X is not a single address space OS that is based on a message passing concept.

    To clear things up, if you say that ARMv7-A can't operate in true big-endian mode (did I understand you right here?) it seems that I had misread you there:

    https://morph.zone/modules/newbb_plus/viewtopic.php?forum=11&topic_id=6726&start=92
  • »29.06.11 - 22:33
  • Yokemate of Keyboards
    Zylesea
    Posts: 2050 from 2003/6/4
    In my book it is not a question of drivers, but a matter of shared structures and the issue that in this case address and data endianness should be the same.
    I.e. if a big-endian application points (BE data format) to 0x00000004 on ARM in BE mode, will it get exec base or will it get some arbitrary value, because the address format may still be LE?
    If the ARM chips in question don't offer a "full" big-endian environment there's not much benefit over x86 IMHO.
    This question (the exec one) has been raised several times already, and yet nobody has answered it.
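
    To make the data side of that question concrete, here is a minimal Python sketch. It is not MorphOS code: the helper name read_u32 is made up for illustration, and it simply assumes that a 32-bit field of a shared structure was written by big-endian m68k/PPC code and is then read back with either byte order.

```python
def read_u32(raw: bytes, byteorder: str) -> int:
    """Interpret four raw bytes as an unsigned 32-bit integer."""
    if len(raw) != 4:
        raise ValueError("expected exactly 4 bytes")
    return int.from_bytes(raw, byteorder)


# Bytes as big-endian code would store the 32-bit value 0x00000004 in memory.
stored = (0x00000004).to_bytes(4, "big")

# A reader using the same byte order recovers the intended value.
as_big_endian = read_u32(stored, "big")

# A little-endian reader of the same bytes sees a different number
# unless it byte-swaps every such field.
as_little_endian = read_u32(stored, "little")

print(hex(as_big_endian), hex(as_little_endian))
```

    The sketch only models how the stored bytes are interpreted. Whether the address itself is handled as expected in ARM's BE mode is exactly the open question and is not answered by it.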

    [ Edited by Zylesea 30.06.2011 - 00:45 ]
    --
    http://via.bckrs.de

    Whenever you're sad just remember the world is 4.543 billion years old and you somehow managed to exist at the same time as David Bowie.
    ...and Matthias , my friend - RIP
  • »29.06.11 - 22:42
  • Order of the Butterfly
    minator
    Posts: 365 from 2003/3/28
    OK. I wasn't talking about 68K code other than in passing but I see the problem now.
    ...and the solution.

    Quote:

    That's because contrary to AmigaOS or MorphOS' ABox, Mac OS X is not a singular address space OS that is based on message passing concept.


    True but... MorphOS is designed to have separate boxes.
    You could run the existing A-Box in an emulated PPC environment. That in turn runs the 68K stuff.

    You then have another A-Box2 that only runs ARM compiled stuff. Yes the 68K and PPC stuff will suffer but anything native will fly along.

    It also means they can get rid of all those hacks and patches that they added to keep compatibility with the misbehaving Amiga apps. That'd be a much cleaner system albeit still based on a design that's over a quarter of a century old.

    Ideally they'd dump the Amiga API altogether and build the Q-Box but I guess we can't expect miracles :lol:
  • »29.06.11 - 23:01
  • Yokemate of Keyboards
    Andreas_Wolf
    Posts: 12010 from 2003/5/22
    From: Germany
    > I wasn't talking about 68K code other than in passing

    But that's what Jim who I replied to with my "true big-endian" question was talking about ;-)

    > I see the problem now.

    Finally ;-)

    > ...and the solution.

    Hear, hear :-)

    > MorphOS is designed to have separate boxes. You could run the existing
    > A-Box in an emulated PPC environment. That in turn runs the 68K stuff.
    > You then have another A-Box2 that only runs ARM compiled stuff. Yes the
    > 68K and PPC stuff will suffer but anything native will fly along.

    Reminds me of this proposal, only with ARM instead of x86:

    https://morph.zone/modules/newbb_plus/viewtopic.php?topic_id=6570&forum=3
  • »29.06.11 - 23:18
  • Yokemate of Keyboards
    Zylesea
    Posts: 2050 from 2003/6/4
    Having two boxes for different ISA was suggested a while ago already (for x86).
    https://morph.zone/modules/newbb_plus/viewtopic.php?topic_id=6570&forum=3#66869
    But who's gonna do it? It is quite a lot of work.
    --
    http://via.bckrs.de

    Whenever you're sad just remember the world is 4.543 billion years old and you somehow managed to exist at the same time as David Bowie.
    ...and Matthias , my friend - RIP
  • »29.06.11 - 23:18
  • Yokemate of Keyboards
    Jim
    Posts: 4977 from 2009/1/28
    From: Delaware, USA
    >Pity. Your statement made it seem like you'd know more in that regard.

    Sorry to disappoint you Andreas. I even neglected to notice that the Cell BE was an in-order processor until we started to discuss it (you have much better attention and recall of details than I do - not to mention your superior ability to locate information).

    As far as ARM goes, I only recently began looking at it. This was primarily due to posts on this site. ARM does have some features that hold it back (in order again, therefore zero branch prediction, etc).

    As mature RISC architectures go, I still think the PPC is one of the best.

    Right now, pricing and availability of suitable hardware are the main obstacles to continued use of these processors.
    "Never attribute to malice what can more readily explained by incompetence"
  • »30.06.11 - 03:50
  • Yokemate of Keyboards
    Andreas_Wolf
    Posts: 12010 from 2003/5/22
    From: Germany
    > I even neglected to notice that the Cell BE was an in order processor
    > until we started to discuss it

    Really? You mentioned that fact in what I think was your very first posting here on MorphZone:

    https://morph.zone/modules/newbb_plus/viewtopic.php?topic_id=6218&forum=11

    I'm not aware that we were in contact before that.

    > ARM does have some features that hold it back (in order again, therefore
    > zero branch prediction, etc).

    Cortex-A9 and Cortex-A15 have out-of-order execution as well as speculative execution.
  • »30.06.11 - 04:05
  • Yokemate of Keyboards
    Jim
    Posts: 4977 from 2009/1/28
    From: Delaware, USA
    https://morph.zone/modules/newbb_plus/viewtopic.php?topic_id=6218&forum=11

    Wow, now I know how minator must feel. You recalled that? You MUST be a cyborg.
    I guess I just wasn't aware of how big a performance hit in-order execution creates (until you posted some comparison benchmarks).
    "Never attribute to malice what can more readily explained by incompetence"
  • »30.06.11 - 14:29
  • Yokemate of Keyboards
    Andreas_Wolf
    Posts: 12010 from 2003/5/22
    From: Germany
    > You recalled that?

    No. I just didn't recall that it was our discussion that made you aware of Cell being in-order. So I fed the MorphZone search facility with the terms "cell" and "in-order" and guess what it came up with: it lists your very first MorphZone posting as the oldest MorphZone posting containing both those terms.

    > You MUST be a cyborg.

    This I am as much as I am a prophet ;-)

    > I guess I just wasn't as aware of big a performance hit in-order
    > execution creates (until you posted some comparison benchmarks).

    https://morph.zone/modules/newbb_plus/viewtopic.php?forum=3&topic_id=6993&start=66

    You mean this?
  • »30.06.11 - 14:46
  • Order of the Butterfly
    minator
    Posts: 365 from 2003/3/28
    Quote:

    ARM does have some features that hold it back (in order again, therefore zero branch prediction, etc).


    Why do you think in-order processors have no branch prediction?
    The ARM10 had it over a decade ago and that was an in-order design.
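
    As a rough illustration that branch prediction needs no out-of-order machinery, here is a toy Python model of a 2-bit saturating counter, a textbook prediction scheme. It is not a description of the ARM10's actual predictor; the function name and the sample branch pattern are made up for the example.

```python
def simulate_two_bit_predictor(outcomes, initial_state=2):
    """Count correct predictions of a single 2-bit saturating counter.

    States 0 and 1 predict "not taken", states 2 and 3 predict "taken".
    The counter moves one step towards the actual outcome after each branch.
    """
    state = initial_state
    correct = 0
    for taken in outcomes:
        predicted_taken = state >= 2
        if predicted_taken == taken:
            correct += 1
        # Saturate at 0 and 3 so a single surprise does not flip the prediction.
        state = min(state + 1, 3) if taken else max(state - 1, 0)
    return correct


# A loop branch that is taken nine times and then falls through once.
loop_branch = [True] * 9 + [False]
hits = simulate_two_bit_predictor(loop_branch)
print(f"{hits} of {len(loop_branch)} predictions correct")
```

    Nothing in the model depends on the order in which instructions execute; the predictor only decides which instructions get fetched next, which an in-order pipeline benefits from just as well.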

    Also, in-order does not automatically mean slow. It really depends on what you are running. Things like Cell and GPUs are all in-order but they're very fast processors.
    In-order becomes a problem if you are trying to run "control" code at high speed. "Data"-processing-heavy workloads (think video, image, audio processing and even games) get relatively little advantage from out-of-order, so processors focused only on data processing tend to be in-order or at best only mildly out-of-order.

    e.g. The new Freescale parts designed for networking appear to be "mildly" out-of-order, nothing compared to even a G5.
  • »30.06.11 - 19:39
  • Yokemate of Keyboards
    Jim
    Posts: 4977 from 2009/1/28
    From: Delaware, USA
    Prophet? No. Cybernetic? Not really.
    But truly you ARE the king of the search engines.

    And yes, I'd never realized how much of a role out of order execution and branch prediction played in modern CPUs until you quoted those figures.

    I really had much higher expectations of the Cell BE's performance before that.

    It also points to one of the reasons most PPCs have an inherent advantage over ARM CPUs.
    "Never attribute to malice what can more readily explained by incompetence"
  • »01.07.11 - 00:05
  • Yokemate of Keyboards
    Jim
    Posts: 4977 from 2009/1/28
    From: Delaware, USA
    >Things like Cell and GPUs are all in-order but they're very fast processors.

    Actually, I used to think that too. Until we spent some months discussing it (and even making some inquiries to IBM).
    You see, a 3.2 GHz Cell BE IS fast. But how powerful is it?
    You need to look at the benches that Andreas re-posted the link to.
    Per clock cycle, an out-of-order PPC is much more powerful than the Cell's core processor.
    Trust me, this has been thoroughly hashed over. Many of us had high expectations for the Cell's performance.
    But the PPE core turned out to be somewhat weak and programming the SPEs for maximum efficiency was about as easy as juggling running chainsaws.
    "Never attribute to malice what can more readily explained by incompetence"
  • »01.07.11 - 00:15
  • Yokemate of Keyboards
    Andreas_Wolf
    Posts: 12010 from 2003/5/22
    From: Germany
    > I'd never realized how much of a role out of order execution
    > and branch prediction played in modern CPUs until you quoted
    > those figures.

    The Cell PPE while being an in-order design does have branch prediction. See minator's reply where he says that out-of-order execution capability is not a prerequisite for branch prediction.

    > I really had much higher expectations of the Cell BE's
    > performance before that.

    The Cell's special performance comes from the SPEs, if used properly that is. The benchmark figures I quoted take only the PPE into account, the SPEs are ignored.

    > It also points to one of the reasons most PPCs have an
    > inherent advantage over ARM CPUs.

    As it seems you missed what minator and I tried to tell you: there have been in-order ARM CPUs with branch prediction for at least a decade, and since early 2010 there're also out-of-order ARM CPUs. So what's this reason and advantage you're talking about?
  • »01.07.11 - 00:29
  • Order of the Butterfly
    minator
    Posts: 365 from 2003/3/28
    Quote:

    Per clock cycle, an out of order PPC is much more powerful than the Cell's core processor.
    Trust me, this has been thoroughly hashed over. Many of us had high expectations for the Cell's performance.


    That was sort of my point. It's weak(ish) at control code. But you'll find it'll do rather well on AltiVec stuff, probably better than any other PowerPC.

    In reality the PPE's performance was quite varied. I've seen benchmarks for it that showed it to be both out-gunning and out-gunned by a G4.
    It's designed for throughput so you really need to program it directly for that. A lot of existing code won't do that very well.

    Quote:

    But the PPE core turned out to be somewhat weak and programming the SPEs for maximum efficiency was about as easy as juggling running chainsaws.


    That's a load of twaddle.

    Most of the people who say Cell is difficult to program have never actually programmed it.

    I spent a lot of time working with a start-up that was going to build Cell workstations. I read *everything* there was on Cell and there was a lot to read, most of which was completely missed by the press. Of the people that actually programmed Cell, pretty much no one said it was difficult; more involved, yes, but not difficult.

    Later on I got myself a PS3 and programmed it myself. Getting the code to run was complicated but that was because IBM's SDK was badly laid out. The actual coding was copy-paste of some AltiVec code and some added triple buffering. Not exactly rocket science.
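
    For readers unfamiliar with the term, here is a small Python sketch of the triple buffering idea. It is not Cell SDK code: it assumes that "triple buffering" means one buffer being filled, one being processed and one being written back at the same time, and the function name and the load/compute/store labels are made up for illustration.

```python
def triple_buffer_schedule(num_chunks, num_buffers=3):
    """Return, per time step, the (role, chunk, buffer) tuples that are active.

    Chunk i is loaded in step i, computed in step i + 1 and stored in
    step i + 2, always in buffer i % num_buffers.
    """
    schedule = []
    for step in range(num_chunks + 2):
        active = []
        if step < num_chunks:
            active.append(("load", step, step % num_buffers))
        if 0 <= step - 1 < num_chunks:
            active.append(("compute", step - 1, (step - 1) % num_buffers))
        if 0 <= step - 2 < num_chunks:
            active.append(("store", step - 2, (step - 2) % num_buffers))
        schedule.append(active)
    return schedule


schedule = triple_buffer_schedule(4)
for step, active in enumerate(schedule):
    print(step, active)
```

    Once the pipeline is full, every step touches three different buffers, so in this model the transfers in and out can overlap with the computation instead of waiting for it.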

    As far as I can tell, the whole "Cell is difficult to program" was a FUD campaign by IBM's competitors.
  • »01.07.11 - 00:48
  • Yokemate of Keyboards
    Andreas_Wolf
    Posts: 12010 from 2003/5/22
    From: Germany
    > you'll find it'll do rather well on AltiVec stuff, probably
    > better than any other PowerPC.

    I can imagine that in this regard both the in-order POWER6 and the out-of-order POWER7 surpass the Cell PPE (as well as the Xenon/XCGPU for that matter).

    > the whole "Cell is difficult to program" was a FUD campaign
    > by IBM's competitors.

    I think that also the competitors of IBM's customer/partner Sony, who are IBM customers as well (but for other products), may have had good reasons to take part in such a campaign ;-)
  • »01.07.11 - 01:05
  • Yokemate of Keyboards
    Jim
    Posts: 4977 from 2009/1/28
    From: Delaware, USA
    Actually you've both made good points.
    But I have examined IBM's programming tutorials for maximizing the potential of the SPEs and I'd still say it resembles a juggling act.
    And trying to compare a Cell to a standard PPC becomes difficult as the efficient use/programming of the SPEs is essential to harnessing the Cell's potential.
    At least with GPUs there are some programming tools/standards.
    Are there any software tools to help code for and manage Cell use?
    "Never attribute to malice what can more readily explained by incompetence"
  • »01.07.11 - 01:15
  • Yokemate of Keyboards
    Andreas_Wolf
    Posts: 12010 from 2003/5/22
    From: Germany
    > Are there any software tools to help code for and manage Cell use?

    http://www.alphaworks.ibm.com/keywords/cell
  • »01.07.11 - 02:11
  • Yokemate of Keyboards
    Jim
    Posts: 4977 from 2009/1/28
    From: Delaware, USA
    CellAttach looks interesting, but it's for the PowerXCell 8i processor.
    The rest of the software listed is pretty much the same stuff they had the last time I looked.
    Nothing that really integrates and coordinates SPE programming with the PPE.

    Have you looked at the programming examples? They're painful.
    "Never attribute to malice what can more readily explained by incompetence"
  • »01.07.11 - 03:02
  • Yokemate of Keyboards
    Andreas_Wolf
    Posts: 12010 from 2003/5/22
    From: Germany
    > Nothing that really integrates and coordinates SPE programming with the PPE.

    I'm not sure I even know what you mean here. I guess if such software tools exist then minator who I regard as an expert in all things Cell can help you better than I ever could.

    > Have you looked at the programming examples?

    I had a quick glance at the IBM articles I referred you to some months ago but my programming experience is too shallow for a meaningful assessment as to whether properly programming the Cell including its SPEs is just "more involved" or really resembles "painful chainsaw juggling".
  • »01.07.11 - 10:11
  • Order of the Butterfly
    minator
    Posts: 365 from 2003/3/28
    Quote:

    At least with GPUs there are some programming tools/standards.
    Are there any software tools to help code for and manage Cell use?


    The same one: OpenCL
  • »01.07.11 - 19:07
  • Order of the Butterfly
    minator
    Posts: 365 from 2003/3/28
    Quote:

    > you'll find it'll do rather well on AltiVec stuff, probably
    > better than any other PowerPC.

    I can imagine that in this regard both the in-order POWER6 and the out-of-order POWER7 surpass the Cell PPE (as well as the Xenon/XCGPU for that matter).


    They're POWER, not PowerPC ;-)
  • »01.07.11 - 19:09
  • Yokemate of Keyboards
    Andreas_Wolf
    Posts: 12010 from 2003/5/22
    From: Germany
    > They're POWER, not PowerPC ;-)

    https://morph.zone/modules/newbb_plus/viewtopic.php?forum=3&topic_id=7289&start=40 ;-)
  • »01.07.11 - 19:51
  • Yokemate of Keyboards
    Andreas_Wolf
    Posts: 12010 from 2003/5/22
    From: Germany
    > The same one: OpenCL

    Interesting. Found this:

    "The framework allows for parallel programming across a number of devices such as CPUs, GPUs, and accelerators (like the Cell/B.E. SPU)."
    http://www.alphaworks.ibm.com/tech/opencl

    Seeing as Jim has expressed interest in OpenCL before this might be exactly what he's looking for.


    Edit: Found a project to create an open source implementation:

    http://sites.google.com/site/openclps3/

    [ Edited by Andreas_Wolf 01.07.2011 - 23:19 ]
  • »01.07.11 - 20:00