• Yokemate of Keyboards
    Andreas_Wolf
    Posts: 12219 from 2003/5/22
    From: Germany
    Update:

    > Description of the relevant FTF 2012 session (FTF-IND-F0016) now online:
    > "This session describes new instructions for extended support for misaligned
    > vectors, support for handling head and tail vectors, and the long-awaited capability
    > to move from general purpose to vector registers. Learn about the performance
    > improvements resulting from enhancement to both AltiVec and the e6500 core."
    > http://www.getregisterednow.com/FSL/CEX/Session.aspx?li=1

    From the relevant FTF 2012 session slides:

    "AltiVec e6500 core technology is essentially the same as AltiVec technology from the 74xx processors except the following:
    · Adds new instructions for computing absolute differences [...]
    · Adds new instructions for moving data from GPRs to VRs [...]
    · Adds new instructions for dealing with misaligned vectors more easily [...]
    · Adds new instructions for dealing with elements of vectors [...] These allow loading/storing of arbitrary elements to arbitrary addresses
    [...]
    Little-endian byte ordering is not supported on Power ISA AltiVec definition.
    Data stream instructions [...] are not supported on Power ISA AltiVec definition.
    [...]
    IVORs added for AltiVec unavailable interrupt and AltiVec assist interrupt.
    Move from GPR to VR [...] instructions move data from 2 GPRs into a vector register.
    Absolute difference instructions [...] compute the unsigned absolute differences. These
    are useful for motion estimation in video processing.
    Extended support for misaligned vectors [...]
    Extended support for handling head and tail of vectors [...]
    External PID instructions for loading and storing VRs [...] for moving data efficiently across address spaces.
    [...]
    Data Stream Instructions [...] were present in the first definition of AltiVec technology for PowerPC processors. These instructions provided software initiated streaming prefetch controls. In Power ISA these instructions are no longer defined, and streaming is performed by variants of the dcbt instruction or by hardware prefetchers. Cache stashing could be considered an alternative as well. For Freescale EIS, these instructions are treated as no-ops since they may be present in older code and do not change architectural state.
    [...]
    Original AltiVec technology on the e600 core included an AltiVec unavailable exception. IVORs are the equivalent exception mechanism in e500-based cores. The AltiVec unavailable interrupt occurs when an attempt is made to execute an AltiVec instruction and MSR[SPV] = 0. This can be useful in reducing context switch overhead by not saving AltiVec registers unless a process actually uses AltiVec instructions. [...] In general, AltiVec vector instructions generate very few exceptions
    [...]
    Moving GPRs into a Vector Register [...] was a source of frustration in original AltiVec technology. The explanation was that the interconnect between GPRs and VPRs was not warranted when data could be moved quickly via store and load from L1 cache. Still, the capability was desired by many customers. These new instructions will make it simpler to program with AltiVec.
    [...]
    Absolute Difference Instructions [...] are useful for motion estimation in video processing.

    AltiVec e6500 limitations
    · Operates in big-endian only
    · Does not have data streaming (dst type instructions) They are executed as NOPs
    [...]
    New Load and Store Instructions
    · Reduces the effort to load and store unaligned (not quad-word aligned) data
    · Reduces number of registers needed for permute and mask vectors
    · Reduces the effort to deal with the head and tail of unaligned strings or vector arrays
    · Improves performance through:
    - Fewer instructions
    - Less register pressure
    - Less context to save
    · Makes programming AltiVec technology simpler
    [...]
    Summary
    · AltiVec technology is being "advanced" into the e6500 core (after skipping the e500 -- which had SPE, the e500mc, and the e5500) from the e600 core.
    · New instructions to move data from GPRs to VRs will reduce complexity and instruction count.
    · New load and store instructions simplify misaligned accesses and reduce complexity and instruction count.
    "
    http://www.freescale.com/files/training_pdf/FTF/2012/americas/WBNR_FTF12_IND_F0016.pdf (pages 4-14 and 32)
    http://www.freescale.com/files/training/doc/dwf/DWF13_AMF_NET_T0015.pdf (pages 32-50)
    http://www.freescale.com/files/training/doc/ftf/2014/FTF-NET-F0139.pdf (pages 15/16-25/26)
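
    To illustrate why absolute differences matter for motion estimation, here is a minimal scalar sketch in plain C. It is not AltiVec code and not taken from the slides. The function names, the choice of unsigned 8-bit elements and the 16x16 block size are my own assumptions for illustration; the slides only say that the new instructions compute unsigned absolute differences.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Unsigned absolute difference of two bytes, computed without wrap-around. */
uint8_t absdiff_u8(uint8_t a, uint8_t b)
{
    return a > b ? (uint8_t)(a - b) : (uint8_t)(b - a);
}

/* Scalar model of one 16-element operation: out[i] = |a[i] - b[i]|.
 * Hypothetical helper, it only mimics the arithmetic, not the instruction. */
void absdiff_u8x16(const uint8_t a[16], const uint8_t b[16], uint8_t out[16])
{
    assert(a != NULL && b != NULL && out != NULL);
    for (size_t i = 0; i < 16; i++)
        out[i] = absdiff_u8(a[i], b[i]);
}

/* Sum of absolute differences (SAD) over a 16x16 pixel block, the usual
 * inner loop of block-matching motion estimation. 'stride' is the number
 * of bytes between the starts of consecutive rows in both images. */
uint32_t sad_16x16(const uint8_t *cur, const uint8_t *ref, size_t stride)
{
    assert(cur != NULL && ref != NULL);
    uint32_t sum = 0;
    for (size_t row = 0; row < 16; row++) {
        uint8_t diff[16];
        absdiff_u8x16(cur + row * stride, ref + row * stride, diff);
        for (size_t i = 0; i < 16; i++)
            sum += diff[i];
    }
    return sum;
}
```

    A motion search would call sad_16x16 for each candidate position in the reference frame and keep the position with the smallest sum. With a dedicated instruction, each inner absdiff_u8x16 call would presumably collapse into a single vector operation.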

    "AltiVec technology for the e6500 core is essentially the same as AltiVec technology from the e600 core, except for the following:
    - Adds new instructions for computing absolute differences [...] These speed up the inner loop of motion estimation video processing
    - Adds new instructions for dealing with misaligned vectors more easily [...]
    - Adds new instructions for dealing with elements of vectors [...] These allow loading/storing of arbitrary elements to arbitrary addresses
    - Instructions for moving data from GPRn to vector register [...]

    AltiVec technology for e6500 limitations
    - Operates in big-endian only
    - Does not have data streaming (dst type instructions) They are executed as NOPs
    "
    http://www.freescale.com/files/training_pdf/FTF/2012/americas/WBNR_FTF12_NET_F0117.pdf (pages 45/46 and 48)
    http://2012ftf.ccidnet.com/pdf/0381.pdf (pages 45/46 and 48)
    http://www.freescale.com.cn/cstory/ftf/2012/pdf/0381.pdf (pages 45/46 and 48)
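
    For anyone wondering what "head and tail" of a misaligned vector array means in practice, here is a minimal sketch in plain C of the bookkeeping involved. It is not AltiVec code and not from the slides. The struct and function names are hypothetical, and the only hardware fact assumed is the 16-byte (quad-word) vector size mentioned in the slides.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VEC_BYTES 16u  /* quad-word: size of one AltiVec vector register */

/* Hypothetical result type: how a byte range splits around 16-byte boundaries. */
typedef struct {
    size_t head;    /* bytes before the first aligned boundary */
    size_t blocks;  /* number of full, aligned 16-byte vectors */
    size_t tail;    /* leftover bytes after the last full vector */
} vec_split;

/* Split a buffer starting at address 'addr' with 'len' bytes into an
 * unaligned head, a run of aligned full vectors, and an unaligned tail. */
vec_split split_for_vectors(uintptr_t addr, size_t len)
{
    vec_split s;
    size_t misalign = (size_t)(addr % VEC_BYTES);

    s.head = misalign ? VEC_BYTES - misalign : 0;
    if (s.head > len)
        s.head = len;  /* buffer ends before the first boundary */

    s.blocks = (len - s.head) / VEC_BYTES;
    s.tail   = (len - s.head) % VEC_BYTES;

    /* The three parts must cover the buffer exactly. */
    assert(s.head + s.blocks * VEC_BYTES + s.tail == len);
    return s;
}
```

    With original AltiVec, the head and tail parts typically need extra permute and mask work on top of the aligned loop, which is the effort the new load and store instructions are said to reduce.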


    Edit: added more PDF links

    [ Edited by Andreas_Wolf 21.04.2014 - 18:59 ]
  • »09.08.12 - 23:29