ARM for the future?
  • Yokemate of Keyboards
    Andreas_Wolf
    Posts: 12199 from 2003/5/22
    From: Germany
    > AFAIK most ARM CPU's are bi-endian. At least I think that's the case with ARMv7/Cortex

    After half a decade of assumptions, speculation and guesswork here on MorphZone, I think I found the definitive answer to the question of ARM's endianness in these references:

    http://translatedcode.wordpress.com/2012/04/02/this-end-up/
    http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0203g/Chddgffb.html

    The gist of it is that ARM has one little-endian mode (little-endian for both data and code, aka true little-endian) and two big-endian modes: BE32 and BE8. BE32 is big-endian for both data and code (aka true big-endian), whereas BE8 is big-endian for data only, with code staying little-endian.

    BE32 is supported by:
    - ARMv4
    - ARMv5
    - ARMv6

    BE8 is supported by:
    - ARMv6
    - ARMv7
    - ARMv8
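
To make the difference concrete, here is a minimal Python sketch of how the same 32-bit words would be laid out in memory. The example value and the names `be8_image` and `be32_image` are illustrative assumptions, not taken from the linked references; the instruction word is the classic ARM no-op `mov r0, r0`.

```python
import struct

value = 0x11223344   # arbitrary 32-bit data word (illustrative)
insn = 0xE1A00000    # ARM encoding of "mov r0, r0"

little = struct.pack("<I", value)  # least significant byte first
big = struct.pack(">I", value)     # most significant byte first

# Assumed layout: one instruction word followed by one data word.
# BE8-style: code stays little-endian, data is big-endian.
be8_image = struct.pack("<I", insn) + struct.pack(">I", value)
# BE32-style: code and data are both big-endian.
be32_image = struct.pack(">I", insn) + struct.pack(">I", value)

print(little.hex())      # 44332211
print(big.hex())         # 11223344
print(be8_image.hex())   # 0000a0e111223344
print(be32_image.hex())  # e1a0000011223344
```

The data word looks the same in both big-endian layouts; only the instruction word differs.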
  • »21.02.14 - 13:21
  • Yokemate of Keyboards
    Zylesea
    Posts: 2057 from 2003/6/4
    Quote:

    Andreas_Wolf wrote:
    > AFAIK most ARM CPU's are bi-endian. At least I think that's the case with ARMv7/Cortex

    After half a decade of assumptions, speculation and guesswork here on MorphZone, I think I found the definitive answer to the question of ARM's endianness in these references:

    http://translatedcode.wordpress.com/2012/04/02/this-end-up/
    http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0203g/Chddgffb.html

    The gist of it is that ARM has one little-endian mode (little-endian for both data and code, aka true little-endian) and two big-endian modes: BE32 and BE8. BE32 is big-endian for both data and code (aka true big-endian), whereas BE8 is big-endian for data only, with code staying little-endian.

    BE32 is supported by:
    - ARMv4
    - ARMv5
    - ARMv6

    BE8 is supported by:
    - ARMv6
    - ARMv7
    - ARMv8


    Andreas, you're the man! Good find. I had almost given up on that issue.
    So the beefier ARMs don't offer the benefit of easier-to-achieve binary compatibility.

    The Raspberry Pi, however, is ARMv6-based...
    --
    http://via.bckrs.de

    Whenever you're sad just remember the world is 4.543 billion years old and you somehow managed to exist at the same time as David Bowie.
    ...and Matthias , my friend - RIP
  • »21.02.14 - 21:37
  • Moderator
    Kronos
    Posts: 2334 from 2003/2/24
    Quote:

    Zylesea wrote:

    So the beefier ARMs don't offer the benefit of easier-to-achieve binary compatibility.





    Don't really see the problem here (or I failed to understand the linked text).

    BE8 would still allow all system structures to be 1:1 copies of the 68k/PPC ones.

    All data handled internally by an app can also be BE8 without any problem.

    The only thing we would need is an ELF loader that makes sure it loads code segments as LE and data segments as BE (which I assume would be trivial).

    The same goes for a 68k or PPC emulator: all data is BE, and so is all legacy code (which is just data to the emulator); only the snippets of ARM code it wants executed have to be loaded as LE.
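
A rough Python sketch of that loader idea, under assumptions that are not from the thread: the hypothetical input is a list of `(kind, bytes)` segments stored as big-endian 32-bit words, and only fixed-width 32-bit ARM instructions are considered (no Thumb). `swap_words` and `load_segments` are made-up names, not part of any real ELF loader.

```python
import struct

def swap_words(blob: bytes) -> bytes:
    """Reverse the byte order of every 32-bit word in blob."""
    if len(blob) % 4:
        raise ValueError("length must be a multiple of 4")
    count = len(blob) // 4
    words = struct.unpack(f">{count}I", blob)   # read as big-endian words
    return struct.pack(f"<{count}I", *words)    # write back little-endian

def load_segments(segments):
    """Build a memory image: code words become LE, data is copied as-is (BE)."""
    memory = bytearray()
    for kind, blob in segments:
        memory += swap_words(blob) if kind == "code" else blob
    return bytes(memory)

image = load_segments([
    ("code", bytes.fromhex("e1a00000")),  # "mov r0, r0", stored big-endian
    ("data", bytes.fromhex("11223344")),  # data word, stays big-endian
])
print(image.hex())  # 0000a0e111223344
```

Whether the swap belongs in the loader or in the linker is a design decision this sketch does not settle.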
  • »21.02.14 - 22:13
  • Yokemate of Keyboards
    Jim
    Posts: 4977 from 2009/1/28
    From: Delaware, USA
    It is an additional complication, but less complicated than a jump to x86 would be.
    Still, couldn't we just stay with our current ISA?
    "Never attribute to malice what can more readily explained by incompetence"
  • »21.02.14 - 23:50
  • Priest of the Order of the Butterfly
    ausPPC
    Posts: 543 from 2007/8/6
    From: Pending...
    Some other thoughts on ARM on a different project - http://reactos.org/node/779
    PPC assembly ain't so bad... ;)
  • »01.03.14 - 03:02
  • Yokemate of Keyboards
    Andreas_Wolf
    Posts: 12199 from 2003/5/22
    From: Germany
    > http://reactos.org/node/779

    From there:
    "Apple, NVidia, Qualcomm, and Samsung are only some of the companies that have developed custom processors implement the ARM ISA. These aren't just designs that combine a generic ARM processor with a bunch of peripherals like graphics and memory, these are basically genuine custom CPU designs where groups experiment with pipelining, branch prediction, and even layout to maximize performance and minimize power usage."

    For Apple and Qualcomm this is true, but chips with Nvidia's custom ARM core (Denver) have yet to emerge, and Samsung hasn't even announced any custom ARM core yet.
  • »01.03.14 - 12:49
  • Order of the Butterfly
    minator
    Posts: 370 from 2003/3/28
    I don't think Freescale made MIPS chips but they are/were big in networking so would have been a big competitor.
  • »03.03.14 - 23:13
  • Yokemate of Keyboards
    Andreas_Wolf
    Posts: 12199 from 2003/5/22
    From: Germany
    > I don't think Freescale made MIPS chips but they are/were big in networking so
    > would have been a big competitor.

    Thanks, so it's more like he means "the market(s) Freescale is in" when he says just "Freescale". That'd make sense at least.
  • »03.03.14 - 23:23
  • Acolyte of the Butterfly
    KimmoK
    Posts: 102 from 2003/5/19
    From what I have learned, MIPS is not going away anytime soon.
    Companies like Cavium seem to keep MIPS on their high-performance branch while also developing ARM-core-based chips.

    It seems similar with Freescale. Their roadmap has higher-performing Power-core chips than ARM ones.

    But only the future will tell...
    :-x :-P 8-)
  • »04.03.14 - 13:19
  • Yokemate of Keyboards
    Andreas_Wolf
    Posts: 12199 from 2003/5/22
    From: Germany
    > Companies like Cavium seem to keep MIPS still on their high performance branch
    > while developing also ARM core based chips.

    ...and even their own ARM cores:

    https://morph.zone/modules/newbb_plus/viewtopic.php?forum=3&topic_id=7675&start=300
  • »04.03.14 - 19:21
  • Order of the Butterfly
    minator
    Posts: 370 from 2003/3/28
    I'm surprised this hasn't appeared here yet:

    http://www.anandtech.com/show/7910/apples-cyclone-microarchitecture-detailed

    As suspected, it's an absolute monster: a very wide OoO machine that can decode 6 instructions per cycle.
    Clock for clock it's as fast as anything Intel makes. Pretty astonishing given that it's Apple's first 64-bit core design.

    I wanna see what they come up with next!
  • »05.04.14 - 16:39
  • Yokemate of Keyboards
    Andreas_Wolf
    Posts: 12199 from 2003/5/22
    From: Germany
    > I'm surprised this hasn't appeared here yet:
    > http://www.anandtech.com/show/7910/apples-cyclone-microarchitecture-detailed

    I read that some days ago. Unfortunately the article doesn't provide a DMIPS figure for the core. It would be nice to have one (also for the X-Gene) for my list :-)
  • »05.04.14 - 19:45
  • Priest of the Order of the Butterfly
    ausPPC
    Posts: 543 from 2007/8/6
    From: Pending...
    http://www.bunniestudios.com/ Open source laptop.
    PPC assembly ain't so bad... ;)
  • »05.04.14 - 21:30
  • Caterpillar
    JuLieN
    Posts: 35 from 2008/4/16
    From: France
    On the Talkchess forum we have a great test for benchmarking ARM CPUs: the Stockfish chess engine. It's perfect because:
    - it uses all the cores you feed it with,
    - it is available for Android and iOS,
    - chess programs love 64 bits, so you clearly see the difference between 32-bit and 64-bit CPUs.

    We measure the average NPS that Stockfish displays (NPS: nodes per second, i.e. how many nodes of the chess tree Stockfish traverses per second).

    Here are our current charts (most powerful devices on top):

    Quote:

    Ipad Air (2 64bit cores 1.4 ghz) 1.218.000 new highscore
    Iphone 5s (2 64bit cores 1.3 ghz) 1.152.000 new
    iPad Mini 2 (2 64bit cores 1.3 GHz) 1.128.000
    Zopo ZP998 (8 cores) 1.100.000
    Star Ulefone U9592 (8 cores 1.6 Ghz) 1.080.000
    ASUS Transformer TF701t (4 cores ) 1.054.000 <- new
    Samsung Note10.1 2014 (4cores exynos) 1.014.000 <- new
    Samsung Galaxy Note2 (4 cores) 714.000
    ASUS Transformer TF300T (4 cores) 661.000
    Sony Xperia Z1 (4 cores) 602.000
    Asus Infinity tf700 (4 cores) 590.000 new
    Note 3 (4 cores) 581.000 new
    Galaxy S3 (4 cores) 574.000
    Cubot One (4 core MTK 6589T@ 4x1.5Ghz) 568.000 new
    LG G2 556.000 new
    Pomp c6 (4 cores MT6589T) 553.000
    Pipo max M9 (4x cortex a9 1.6 ghz) 526.000
    LG Optimus 4xHD (4 cores) 521.000
    ZTE geek v975 (2+2HT intel atom) 511.000 new
    iPad4 510.000
    Nexus 10 (2 cores) 510.000 new
    Google Nexus 5 (4 cores) 508.000 new
    Amoi N828 (4 cores MTK6589 @ 4x1,2GHz) 493.000
    iPhone 5 (2 cores) 476.800
    HTC one X tegra3 (4 cores) 470.000
    Motorola G (4 cores) 460.000 new
    google nexus 7 tablet pc (4 cores) 457.000
    SamsungGalaxy S4(1.9Ghz,4 cores,1GBhash)450.000 new
    Asus Transformer Prime (4 threads) 440.435
    HTC one 438.000 new
    Huawei P6 (Hi-SiliconK3V2 4x1.5) 405.000 new
    Google Nexus4 (4 cores) 400.000
    Xiaomi Mi2 386.000
    Sony Xperia Z (4 cores) 376.000
    Oppo Find 5 368.000
    Samsung Galaxy 3 (2 core) 333.567
    Galaxy Note 329.000
    Trekstor Surftab 305.000
    iPad mini 277.500
    iPad3 277.000
    Ipad2 (2 cores) 274.000
    Motorola Razr i (2 cores hyperthread) 271.000
    Sony xperia S (2 cores) 259.000
    Tablet Rockchip RK3066 2core @1.6Ghz 250.000
    samsung galaxy S2 253.000
    Motorola X 253.000 new
    Motorola Razr (Slimkat Android 4.4.2) 250.000
    iPhone 4S (Apple A5) 225.500
    Lenovo A660 (2 cores) 222.000
    Advent Vega Tegra2 220.500
    Sony Tablet S 218.000
    Motorola Razr xt910 (2 cores) 216.000
    Motorola Razr HD 212.000
    LG Optimus Speed P990 (2 cores) 200.000
    HTC one S qualcomm s4 (2 cores) 200.000
    Sony Xperia P (2 cores) 196.000
    Samsung Galaxy SIII 1.5GHz (2 Cores) 189.867
    Motorola Razr i (1 core) 186.000
    Nuu Nu1 (qualcom dualcore 1.7 Ghz) 142.000
    samsung epic 141.000
    HTC Flyer, 1.5 Ghz 140.000
    Advent Vega Tablet (2 cores) 137.126
    Dell Streak 133.000
    Samsung Vibrant 111.000
    Samsung Galaxy Tab 108.500
    LG Optimus 2x 102.000
    HTC HD2 1 Ghz 1 core 94.000
    HTC Desire S 90.000
    HTC Desire 84.000
    iPhone 4 80.000
    ipod touch 4 78.000
    samsung galaxy s 70.000
    7-inch Barnes and Noble Color 62.500
    Sony PRS-T1 ebook-reader 53.817
    iPhone 3GS 51.000
    HTC Wildfire S 31.000
    Android phone Qualcomm 600MHz (1 core) 25.750
    Ipod touch 2nd gen 16.500
    Palm Pre oc. 1Ghz (Webos1.4.5) 16.000
    ZT-180 10,2" Pad 15.345
    Palm Pre oc. 800mhz (Webos1.4.5) 13.000
    APAD Rockchip 600mhz (android1.5) 10.000
    Palm Pre 500mhz (Webos1.4.5) 8.000


    Apple's A7 is absolutely unrivaled, especially considering that its closest challengers have four or eight cores and twice the clock speed. This is very impressive. No wonder people are starting to think Apple will even switch Macs to the ARM architecture...

    Those charts are updated frequently, as there's always someone to try Stockfish on a new device. :-)

    As the text formatting gets lost here, here's a screen capture of the original post :
    ARMcharts.png

    [ Edited by JuLieN 06.04.2014 - 01:51 ]
    "A good bug is a dead bug" (Don Dailey)
  • »05.04.14 - 22:33
  • Caterpillar
    JuLieN
    Posts: 35 from 2008/4/16
    From: France
    By the way, if you want to contribute and give me the benchmarks of your non-listed devices, here's how to proceed:

    - iOS:
      - Download Stockfish from the App Store.
      - Choose "game -> new game -> analysis".
      - Wait one minute so the figures stabilize, and you've got the NPS measurement in kN/s (kilonodes per second).

    - Android:
      - Download Droidfish from the Play Store.
      - Launch the game and go into its settings: make sure "pondering" and "Display the computer's evaluation" are checked.
      - In the "execution tasks" part of the settings, make sure it is set to the number of cores your device has.
      - Now launch a new game as white and wait a minute so you get a good measurement.

    On my iPad Mini 2 I get around 1100 kN/s (i.e. 1.100.000 nodes per second), and on my Galaxy Note 3 I get around 480 kN/s.

    If you own devices not listed above, please run this test and I'll update the charts (Thorsten and I are the main authors of these charts).
    "A good bug is a dead bug" (Don Dailey)
  • »05.04.14 - 22:43
  • Caterpillar
    JuLieN
    Posts: 35 from 2008/4/16
    From: France
    For comparison with a desktop CPU: my MacBook has a Core i7 running at 2.3 GHz with 4 cores (8 threads with Hyper-Threading), and Stockfish runs at around 5.3 MN/s on it. Since it has 4 times as many threads as the iPad Mini 2, you can see that both CPUs are about as powerful per thread, even though the A7 runs at roughly half the clock speed of the Core i7. This is VERY impressive!
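
The arithmetic behind that comparison can be sketched in a few lines of Python. The inputs are the figures quoted in this thread (5.3 MN/s at 2.3 GHz with 8 threads for the Core i7, about 1100 kN/s at 1.3 GHz with 2 cores for the iPad Mini 2); the helper names are made up for illustration, and dividing by thread count and clock is only a crude normalization, not a proper benchmark methodology.

```python
def per_thread(nps: float, threads: int) -> float:
    """Nodes per second divided evenly across hardware threads."""
    return nps / threads

def per_thread_per_ghz(nps: float, threads: int, ghz: float) -> float:
    """Per-thread throughput normalized by clock frequency."""
    return nps / threads / ghz

core_i7 = per_thread(5_300_000, 8)   # 662500.0
a7 = per_thread(1_100_000, 2)        # 550000.0

core_i7_clock = per_thread_per_ghz(5_300_000, 8, 2.3)
a7_clock = per_thread_per_ghz(1_100_000, 2, 1.3)

print(core_i7, a7)
print(round(core_i7_clock), round(a7_clock))  # 288043 423077
```

Note that the result depends on the divisor: counting the Core i7's 4 physical cores instead of its 8 threads gives 1.325.000 nodes per second per core, well above the A7's 550.000.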
    "A good bug is a dead bug" (Don Dailey)
  • »05.04.14 - 22:47
  • Yokemate of Keyboards
    takemehomegrandma
    Posts: 2720 from 2003/2/24
    Quote:

    JuLieN wrote:
    For comparison with a desktop CPU: my MacBook has a Core i7 running at 2.3 GHz with 4 cores (8 threads with Hyper-Threading), and Stockfish runs at around 5.3 MN/s on it. Since it has 4 times as many threads as the iPad Mini 2, you can see that both CPUs are about as powerful per thread, even though the A7 runs at roughly half the clock speed of the Core i7. This is VERY impressive!


    This is *very* interesting, thanks everyone for sharing the info!

    I'm sure more is to follow... :-)
    MorphOS is Amiga done right! :-)
    MorphOS NG will be AROS done right! :-)
  • »06.04.14 - 06:37
  • Yokemate of Keyboards
    takemehomegrandma
    Posts: 2720 from 2003/2/24
    Quote:

    JuLieN wrote:
    On the Talkchess forum we have a great test for benchmarking ARM CPUs: the Stockfish chess engine. It's perfect because:
    - it uses all the cores you feed it with,
    - it is available for Android and iOS,
    - chess programs love 64 bits, so you clearly see the difference between 32-bit and 64-bit CPUs.

    We measure the average NPS that Stockfish displays (NPS: nodes per second, i.e. how many nodes of the chess tree Stockfish traverses per second).

    Here are our current charts (most powerful devices on top):

    Quote:

    Galaxy S3 (4 cores) 574.000

    SamsungGalaxy S4(1.9Ghz,4 cores,1GBhash)450.000 new



    So the Galaxy S3 is actually considerably faster than the Galaxy S4?
    MorphOS is Amiga done right! :-)
    MorphOS NG will be AROS done right! :-)
  • »06.04.14 - 06:48
  • ASiegel
    Posts: 1377 from 2003/2/15
    From: Central Europe
    Quote:

    JuLieN wrote:
    Since it has 4 times as many threads as the iPad Mini 2, you can see that both CPUs are about as powerful per thread, even though the A7 runs at roughly half the clock speed of the Core i7. This is VERY impressive!


    This assumes that the software used provides meaningful data for comparing both processors. This may be a fallacy. Unfortunately, many tools that are widely used to review the performance of desktop processors are not available on mobile platforms, which makes it rather difficult to do comprehensive comparisons between mobile and desktop CPUs.

    That being said, the performance of smartphones in general is most certainly impressive, especially if you consider that today's high-end phones are roughly ten times faster than their predecessors were a mere 5 years ago (iPhone 3GS vs iPhone 5S, for example).
  • »06.04.14 - 09:14
  • Caterpillar
    JuLieN
    Posts: 35 from 2008/4/16
    From: France
    Quote:

    takemehomegrandma wrote:
    Quote:

    JuLieN wrote:
    On the Talkchess forum we have a great test for benchmarking ARM CPUs: the Stockfish chess engine. It's perfect because:
    - it uses all the cores you feed it with,
    - it is available for Android and iOS,
    - chess programs love 64 bits, so you clearly see the difference between 32-bit and 64-bit CPUs.

    We measure the average NPS that Stockfish displays (NPS: nodes per second, i.e. how many nodes of the chess tree Stockfish traverses per second).

    Here are our current charts (most powerful devices on top):

    Quote:

    Galaxy S3 (4 cores) 574.000

    SamsungGalaxy S4(1.9Ghz,4 cores,1GBhash)450.000 new



    So the Galaxy S3 is actually considerably faster than the Galaxy S4?



    Yes, we were astonished as well. Sometimes, behind the advertising, there are regressions for real-life applications... :-/
    "A good bug is a dead bug" (Don Dailey)
  • »06.04.14 - 10:55
  • Caterpillar
    JuLieN
    Posts: 35 from 2008/4/16
    From: France
    Quote:

    ASiegel wrote:
    Quote:

    JuLieN wrote:
    Since it has 4 times as many threads as the iPad Mini 2, you can see that both CPUs are about as powerful per thread, even though the A7 runs at roughly half the clock speed of the Core i7. This is VERY impressive!


    This assumes that the software used provides meaningful data for comparing both processors. This may be a fallacy. Unfortunately, many tools that are widely used to review the performance of desktop processors are not available on mobile platforms, which makes it rather difficult to do comprehensive comparisons between mobile and desktop CPUs.

    That being said, the performance of smartphones in general is most certainly impressive, especially if you consider that today's high-end phones are roughly ten times faster than their predecessors were a mere 5 years ago (iPhone 3GS vs iPhone 5S, for example).



    Well, this one tool, Stockfish, is available for both desktop and mobile platforms. And as speed is very important for chess engines, it is carefully optimized for each target. For instance, there is a dedicated binary for the Core i7 that makes use of the POPCNT instruction (very useful for bitboard chess engines).
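
For readers unfamiliar with the term: a bitboard stores one bit per square in a 64-bit integer, and counting the set bits (what POPCNT does in a single instruction) answers questions like "how many pawns are left?". The Python sketch below shows a software fallback; the square mapping (a1 as bit 0) and the constant name are assumptions for illustration and are not taken from Stockfish's source.

```python
def popcount(bitboard: int) -> int:
    """Count set bits by repeatedly clearing the lowest one (Kernighan's method)."""
    count = 0
    while bitboard:
        bitboard &= bitboard - 1  # clears the lowest set bit
        count += 1
    return count

# Assumed mapping: bit 0 = a1, bit 63 = h8, so rank 2 occupies bits 8..15.
WHITE_PAWNS_START = 0x000000000000FF00

print(popcount(WHITE_PAWNS_START))  # 8
```

The loop runs once per set bit, whereas a hardware POPCNT takes a single instruction regardless of how many bits are set, which is why a dedicated binary can pay off.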
    "A good bug is a dead bug" (Don Dailey)
  • »06.04.14 - 11:02
  • Yokemate of Keyboards
    Andreas_Wolf
    Posts: 12199 from 2003/5/22
    From: Germany
    Update:

    >> Looks like Freescale has some interesting plans:
    >> http://www.techradar.com/news/computing-components/processors/freescale-sampling-64-bit-arm-processors-already-launching-later-this-year-1230508

    > Yes, this is some nice additional info to what was already known
    > about QorIQ LS chips based on Cortex-A50.

    First members of the QorIQ LS2 series with Cortex-A57 core announced to be available in 2H 2014:

    http://media.freescale.com/phoenix.zhtml?c=196520&p=irol-newsArticle&ID=1916610
  • »08.04.14 - 22:33
  • Yokemate of Keyboards
    Jim
    Posts: 4977 from 2009/1/28
    From: Delaware, USA
    Quote:

    Andreas_Wolf wrote:
    Update:

    >> Looks like Freescale has some interesting plans:
    >> http://www.techradar.com/news/computing-components/processors/freescale-sampling-64-bit-arm-processors-already-launching-later-this-year-1230508

    > Yes, this is some nice additional info to what was already known
    > about QorIQ LS chips based on Cortex-A50.

    First members of the QorIQ LS2 series with Cortex-A57 core announced to be available in 2H 2014:

    http://media.freescale.com/phoenix.zhtml?c=196520&p=irol-newsArticle&ID=1916610


    Not quite as fast as I was hoping for and only A57 cores, but still promising.
    RISC seems to be alive and well at Freescale.
    Interesting seeing them make a point of mentioning the continuing development of PPCs.
    BTW - What the heck does "already launching this year" mean? Already?
    "Never attribute to malice what can more readily explained by incompetence"
  • »09.04.14 - 01:28