As a quick introduction for those of you who don't know, I am the MorphOS developer also known as Mark Olsen. Together with Frank Mariak, Michal Wozniak and Nicolas Sallin, I'm responsible for graphics in MorphOS, doing everything from basic hardware support (such as AGP) to high-level APIs (such as OpenGL).
Since the release of MorphOS 3.2, there has been a lot of discussion about MorphOS' R300 support. R300 GPUs were already supported, in 2D only, before MorphOS 3.2, but using them with MorphOS outside of laptops was quite rare until then. MorphOS 3.2 also introduced Powermac G5 support, which brought MorphOS a lot of new R300 users. This post aims to clear up some misunderstandings about MorphOS R300 support and to talk a bit about future R300 improvements in MorphOS.

The Hardware
First I want to explain a bit about the R300 family of GPUs. There are a total of 11 different chips in the family. The desktop-class chips are the R300, R350, R360, RV350, RV351, RV360, RV370 and RV380; the laptop chips are the M10, M11 and M12. While the main difference between these chips is their hardware configuration (such as how many rendering pipelines they have), another big difference is their system bus support. The most common chips seen in MorphOS systems are the RV350 (most Radeon 9600s), the R350 (Radeon 9800, Radeon 9800 SE and Radeon 9800 Pro) and the M10/M11/M12 chips found in laptops, which are all based on the RV350.

Hardware Compatibility
Many users reported that their Radeon 9800 cards did not work with MorphOS 3.2. While it is a common belief that Radeon 9800s simply did not work with MorphOS 3.2, that's not true. In fact, my initial R300 development system was an Efika with a Radeon 9800 Pro card installed.

[Image: efika-9800.jpeg]
The issue here was that the R350 used in the Radeon 9800 has a compatibility problem with the AGP controller found in Powermac G5 systems, so only the combination of the two caused problems. Unfortunately, G5 machines are where the vast majority of all 9800s are installed, which is why it seemed as though Radeon 9800 cards did not work at all.
MorphZone user Papiosaur gave me a Radeon 9800 Pro Mac card. I spent a couple of days analysing this problem from the driver's point of view and found that AGP transfers of one specific size would cause the lockup. I wrote some code that made the driver avoid submitting that exact amount of data to the graphics card, by padding such requests with some extra data at the end. While this made the 9800 work a lot better, in the end the workaround proved impossible to make 100% stable, as it was not possible to control how the Radeon scheduled data transfers when it had several outstanding requests waiting. The solution was to completely disable AGP support on such setups, meaning that in MorphOS 3.3 Radeon 9800 cards on G5 machines operate entirely with MMIO.

3D Performance
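Looking back briefly at the padding workaround described at the end of the Hardware Compatibility section, the idea can be sketched roughly as below. This is only an illustration in Python, not the actual driver code: the post does not state which transfer size triggers the lockup, so `PROBLEM_SIZE_WORDS` is a made-up value, and `PAD_WORD` stands in for whatever harmless filler the real driver appended.

```python
# Hypothetical value: the post does not say which transfer size causes the lockup.
PROBLEM_SIZE_WORDS = 64

# Hypothetical filler word, assumed to be harmless to the GPU (for example a no-op).
PAD_WORD = 0


def pad_transfer(words, problem_size=PROBLEM_SIZE_WORDS, pad_word=PAD_WORD):
    """Return the 32-bit words to submit, never exactly `problem_size` long."""
    padded = list(words)
    if len(padded) == problem_size:
        # Append one filler word so the request no longer has the bad size.
        padded.append(pad_word)
    return padded
```

As noted above, a check like this only controls the size of each individual request. It cannot control how the GPU schedules several outstanding requests, which is why the workaround could not be made fully stable.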
Some users have expressed dissatisfaction with the 3D performance of R300 cards on MorphOS. It is true that R300 cards don't perform particularly well on MorphOS 3.3 compared to R200 cards, which have seen a lot of optimisation work.

[Image: mini_g5-9600_g5-9800pro_fodquake-0.3_morphos-3.3.png]
As you can see, in my Fodquake test the Mac Mini with its M9+ (an RV280 chip) easily beats the G5 with both its Radeon 9600 and Radeon 9800 Pro options. What's worse, the Radeon 9600 seems to outperform the Radeon 9800 Pro in the G5.
So what's going on here? First, let's take a look at the theoretical performance of these 3 setups. While the Radeon 9600 and 9800 Pro have a lot more pixel processing power, this is something that's not very relevant for the type of 3D games we currently have on MorphOS. These all use the fixed function OpenGL pipeline and thus tax the GPU's pixel processing power very, very lightly. What really matters for 3D performance on MorphOS is how fast you can read textures, read and update the Z buffer and write to the frame buffer. These are all mainly determined by the memory bandwidth of the graphics cards, so let's have a look at the theoretical maximum memory bandwidth of these GPUs:
Radeon 9200: 6.4GB/s
Radeon 9600: 6.4GB/s
Radeon 9800 Pro: 21.76GB/s
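These figures follow from the usual formula of bus width times effective transfer rate. The small sketch below reproduces them. The bus widths and transfer rates are my assumptions, chosen to match commonly quoted reference specifications (128-bit at 400 MT/s, and 256-bit at 680 MT/s for the 9800 Pro); individual cards may differ.

```python
def memory_bandwidth_gb_s(bus_width_bits, effective_mt_s):
    """Theoretical peak bandwidth in GB/s (1 GB = 10**9 bytes)."""
    bytes_per_transfer = bus_width_bits // 8
    return bytes_per_transfer * effective_mt_s * 1_000_000 / 1e9


# Assumed configurations: (bus width in bits, effective rate in MT/s).
CARDS = {
    "Radeon 9200": (128, 400),
    "Radeon 9600": (128, 400),
    "Radeon 9800 Pro": (256, 680),
}

for name, (bits, rate) in CARDS.items():
    print(f"{name}: {memory_bandwidth_gb_s(bits, rate):.2f}GB/s")

# Ratio between the 9800 Pro and the other two cards.
ratio = memory_bandwidth_gb_s(256, 680) / memory_bandwidth_gb_s(128, 400)
print(f"9800 Pro vs 9200/9600: {ratio:.1f}x")
```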
So actually the theoretical memory bandwidth of the Radeon 9200 and the Radeon 9600 is the same. The Radeon 9600, despite its 'high' model number, is actually a rather slow Radeon model when it comes to memory performance. The 9800 Pro, though, should perform considerably better considering it has 3.4 times the theoretical memory bandwidth of the Radeon 9200 and 9600. To explain why it doesn't, we have to look back at the problem with R350 GPUs and Powermac G5 AGP I mentioned earlier. Because AGP is disabled, the G5/Radeon 9800 combination has all vertex data and all register updates pushed to the graphics card one word (4 bytes) at a time. This really, really cripples the performance of the Radeon 9800.
I spent a couple of days working on getting more performance out of R300-based graphics cards. First I started with the Radeon 9600. It being a low performance card, I focused on optimising the vertex data processing to bring it up on par with the equivalent code in the R200 driver. This helps the system keep the graphics card busy as often as possible, improving the overall frame rate obtained.

[Image: g5-9600_fodquake-0.3.png]
The results speak for themselves. In low resolution, the G5/9600 combination jumped from 82.8 to 159 FPS. In high resolution, the performance improvement was more modest, going from 57.1 to 61.7 FPS. Still, this brings the Radeon 9600 a lot closer to the performance of the Radeon 9200, which it should be about as fast as.

Next up came the Radeon 9800. Since it runs on MMIO on G5 machines, I had to put some extra effort into making sure that as little data as possible was pushed through its command buffer. Combining the vertex data optimisation with more aggressive OpenGL state change optimisations resulted in a clear performance win for the G5/9800 combination at both low and high resolutions.

[Image: g5-9800_fodquake-0.3.png]
At both low and high resolution, the G5/9800 combination now delivers more than 3 times the frame rate it does with MorphOS 3.3.
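To give an idea of what a state change optimisation can look like, here is a minimal sketch of one common technique: keeping a shadow copy of each register and skipping writes that would not change anything. This is a generic illustration written in Python, not the MorphOS driver's actual code, and the class name, the callback and the register offsets are all hypothetical.

```python
class ShadowedRegisters:
    """Remembers the last value written to each register and drops redundant writes."""

    def __init__(self, write_fn):
        self._write_fn = write_fn  # called as write_fn(reg, value) for real writes
        self._shadow = {}          # register -> last value actually written
        self.writes_issued = 0
        self.writes_skipped = 0

    def write(self, reg, value):
        if reg in self._shadow and self._shadow[reg] == value:
            # The hardware already holds this value, so save the bus transfer.
            self.writes_skipped += 1
            return
        self._shadow[reg] = value
        self._write_fn(reg, value)
        self.writes_issued += 1


# Usage: the same two state values are set before each of three draw calls.
log = []
regs = ShadowedRegisters(lambda reg, value: log.append((reg, value)))
for _ in range(3):
    regs.write(0x1000, 1)  # hypothetical register offset
    regs.write(0x1004, 0)  # hypothetical register offset
print(regs.writes_issued, regs.writes_skipped)
```

When every register update costs a separate 4-byte MMIO write, each skipped write is a direct saving, which is why this kind of filtering matters more on the G5/9800 combination than on setups where AGP is available.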
I rarely log in to MorphZone which means that I often miss private messages sent on here. If you wish to contact me, please email me at [username]@asgaard.morphos-team.net, where [username] is my username here on MorphZone.