Original article: http://onepc.net/reviews/0026/
By Kelly Lu (kellylu@onepc.net) - January 7, 2001
After the release of the original Voodoo, the 3D acceleration market has seen the rise and fall of mighty corporate empires. First came S3: being the first out of the gate with a 3D accelerator, they were heavily criticized for the incredibly poor performance of their Virge 3D accelerator (or "decelerator," as some labeled it). Then came 3Dfx with their Voodoo line of products, which performed very well for the time and was praised as the "ultimate gaming card." It was not until after the release of the 3rd incarnation of the Voodoo line, the Voodoo3, that 3dfx (now with a lowercase d, after its merger with STB) began to lose its grip on the 3D accelerator market to a company called NVIDIA, which we now know so well.
NVIDIA, with their TNT2 graphics processor, seemed ready to take on the world when their product was neck and neck with the Voodoo3 in terms of performance but had a huge advantage over its fierce competitor: it was capable of rendering in 32-bit color. As many of us may recall, when 3dfx was designing their Voodoo3, they neglected to include support for 32-bit rendering, downplaying the feature by pointing to the performance loss it would incur and claiming that, in terms of graphics quality, there was not much of a difference at all. Its competitors, namely NVIDIA and ATI, set out to prove 3dfx wrong and, as history tells, each produced products that did just that. ATI, with its Rage128, and NVIDIA, with its TNT2, both showed 3dfx that it was possible to render in 32-bit without a significant loss in performance. That is when NVIDIA rose to fame, as its TNT2 was the fastest of the three, although ever so close to the Voodoo3, and that is when 3dfx saw their piece of the pie slowly chipping away.
Since then, we've seen NVIDIA's bit grow into a large piece of the pie and 3dfx's shrink to a measly portion. NVIDIA, now recognized by some as the Intel of the graphics world, has the world's fastest GPU, their GeForce2 Ultra (although it's also the world's most expensive consumer-level 3D accelerator). 3dfx, after some rumors in the year 2000, is no more: it is now in the middle of being gobbled up by its biggest competitor, NVIDIA. ATI, which had previously specialized in top-notch video/DVD playback quality, has also jumped into the high-performance 3D acceleration market with their Radeon GPU, which performs neck and neck with, if not better than, NVIDIA's GeForce2 GTS in some cases. S3, after being the first to design a 3D accelerator and later trying to gain some market share with their poor-performing and buggy Savage4 and then the Savage2000, is no longer in the graphics acceleration business.
Yet after all these years of innovation in 3D acceleration, all of the most popular products available today still use a technique that originated in the 1960s for generating CAD graphics. However, two companies working together, STMicro and Imagination Technologies, think that the time for change is finally here. Their new chipset, the KYRO (the name of their PowerVR Series 3 graphics chip), utilizes a completely different rendering technique that the two companies hope will take the world by storm. It is called tile-based rendering...
So, how is tile-based rendering so different from the conventional rendering methods that popular products from NVIDIA, ATI and 3dfx utilize? To delve into this complicated topic, let me first explain how conventional rendering works and then compare that to the KYRO's tile-based rendering. Before we begin, however, I must mention that all rendering techniques have one sole purpose: to generate a 3D scene that can be displayed on a 2D screen, using millions of mathematical calculations and enormous amounts of data. The diagram below shows the task that all video cards essentially perform:

The conventional rendering method that I'm referring to here is called "immediate mode rendering" and is currently used by most of today's consumer-level graphics products. The basic process that it follows is this:

The key here is that the accelerator processes the entire scene as a whole and each polygon of the scene individually, without knowledge of which goes in front of which. Textures are then applied to every face of each polygon; again, without knowledge of which goes in front of which. Once the entire scene has passed this stage and every bit of each polygon is textured, it is passed to the "depth test," a stage in the rendering process that utilizes the z-buffer. What is a z-buffer, I hear you all crying? The z-buffer is a portion of the graphics system that is used to store the depth of each pixel in the scene. In immediate mode rendering, a portion of the graphics memory is used for storing the z-buffer. Using the z-buffer, the textured polygons in the scene finally know which is in front of which, and this is the stage at which the overlapping portions of the polygons are simply discarded. This overlap is called "overdraw" and will prove important for the improvements that tile-based rendering can bring. The scene is then passed on to the final stage, where transparent polygons are blended and the result is finally drawn and sent to the frame buffer for displaying.
Looking at this method, it seems that a lot of work and memory bandwidth is being wasted on portions of the image that will eventually be thrown out anyway. That's exactly what the KYRO aims to change with its tile-based rendering method.
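To make the wasted work concrete, here is a minimal Python sketch of the texture-first, depth-test-second ordering described above. It is a toy model, not how any real accelerator is programmed: the "polygons" are flat axis-aligned rectangles, a texture lookup is just a counter, and the function name and scene are made up for illustration.

```python
# Toy immediate-mode renderer: every polygon is textured first and only
# then depth-tested against a full-screen z-buffer.
# Assumption: "polygons" are flat, axis-aligned rectangles for brevity.

WIDTH, HEIGHT = 8, 4

def render_immediate(rects):
    """rects: list of (x0, y0, x1, y1, depth, color); smaller depth = closer."""
    z_buffer = [[float("inf")] * WIDTH for _ in range(HEIGHT)]
    frame = [[None] * WIDTH for _ in range(HEIGHT)]
    texture_fetches = 0
    for x0, y0, x1, y1, depth, color in rects:
        for y in range(y0, y1):
            for x in range(x0, x1):
                texel = color               # stands in for a texture lookup
                texture_fetches += 1        # paid even if the pixel is hidden
                if depth < z_buffer[y][x]:  # the depth test comes afterwards
                    z_buffer[y][x] = depth
                    frame[y][x] = texel
    return frame, texture_fetches

# Hypothetical scene: a wall, a crate in front of it, a player in front of both.
scene = [
    (0, 0, 8, 4, 3.0, "wall"),
    (2, 0, 6, 4, 2.0, "crate"),
    (3, 1, 5, 3, 1.0, "player"),
]
frame, fetches = render_immediate(scene)
visible = sum(cell is not None for row in frame for cell in row)
print(fetches, visible, fetches / visible)  # texture fetches vs. visible pixels
```

In this toy scene, the renderer performs 52 texture lookups to fill 32 visible pixels; the extra 20 lookups are the overdraw.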
The tile-based rendering method allows the KYRO to reorganize the different rendering stages and, essentially, follows this process:

The secret to the tile-based rendering method lies right in the name itself. Instead of the entire scene being rendered at one time and each polygon individually, the polygons are grouped together into what are called "display lists." This allows the entire scene to be literally cut up into small "tiles," each rendered individually by the KYRO.
The first benefit that arises from rendering each tile individually rather than an entire scene at a time lies in the aforementioned z-buffer. As each tile is rendered individually, the required size of the z-buffer becomes considerably smaller, allowing the KYRO to have an on-chip z-buffer. Having an on-chip z-buffer eliminates the need for continual memory accesses and therefore frees up the memory for other tasks, such as texture storage.
Rendering each tile individually instead of each polygon individually means that the graphics accelerator knows which polygon is in front of which before it applies the textures to the polygons. This is very important: the depth test, which was the 2nd stage in immediate mode rendering, becomes the very first thing that the graphics accelerator performs on the scene. It can therefore discard any pixels that are not going to appear in the final rendered scene before it has to go to the video memory for the huge textures that it must apply to each polygon. Doing so eliminates virtually all overdraw. This saves huge amounts of memory bandwidth, as a lot of data has already been discarded before the scene goes through the rest of the rendering pipeline.
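To get a rough feel for how much smaller a per-tile z-buffer is, here is a quick back-of-the-envelope calculation in Python. Both the tile size (32x16 pixels) and the depth precision (32 bits per pixel) are assumptions made for illustration; neither figure comes from this review.

```python
def z_buffer_bytes(width, height, bytes_per_depth=4):
    """Storage needed for one depth value per pixel (4 bytes assumed)."""
    return width * height * bytes_per_depth

full = z_buffer_bytes(1024, 768)  # full-screen z-buffer at 1024x768
tile = z_buffer_bytes(32, 16)     # one tile, assuming a 32x16-pixel tile

print(full, tile, full // tile)   # bytes, bytes, and the ratio between them
```

Under these assumptions, a full-screen z-buffer needs 3,145,728 bytes (3MB) while a single tile needs only 2,048 bytes (2KB), which is small enough to plausibly live on the chip itself.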
The rest of the rendering process is pretty much self-explanatory, as after the depth test, the textures are applied to the remaining pixels and then the final product is sent to be blended (for transparent polygons) and then displayed.
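The reordered pipeline described above can be sketched in the same toy style. Again, this is only an illustration under stated assumptions: the polygons are flat rectangles, the tile size is a made-up 4x2 pixels, and the function name is hypothetical. The point is the ordering: the depth test runs per tile in a small buffer first, and texturing happens only for the pixels that survive.

```python
# Toy tile-based renderer: depth test first (per tile), texturing second.
# Assumptions: flat axis-aligned rectangles as "polygons", a toy 4x2 tile.

WIDTH, HEIGHT = 8, 4
TILE_W, TILE_H = 4, 2

def render_tiled(rects):
    """rects: list of (x0, y0, x1, y1, depth, color); smaller depth = closer."""
    frame = [[None] * WIDTH for _ in range(HEIGHT)]
    texture_fetches = 0
    for ty in range(0, HEIGHT, TILE_H):
        for tx in range(0, WIDTH, TILE_W):
            # "Display list": the polygons that overlap this tile.
            display_list = [r for r in rects
                            if r[0] < tx + TILE_W and r[2] > tx
                            and r[1] < ty + TILE_H and r[3] > ty]
            # Stage 1: depth test in a small per-tile z-buffer.
            tile_z = [[float("inf")] * TILE_W for _ in range(TILE_H)]
            winner = [[None] * TILE_W for _ in range(TILE_H)]
            for x0, y0, x1, y1, depth, color in display_list:
                for y in range(max(y0, ty), min(y1, ty + TILE_H)):
                    for x in range(max(x0, tx), min(x1, tx + TILE_W)):
                        if depth < tile_z[y - ty][x - tx]:
                            tile_z[y - ty][x - tx] = depth
                            winner[y - ty][x - tx] = color
            # Stage 2: texture only the pixels that are actually visible.
            for y in range(TILE_H):
                for x in range(TILE_W):
                    if winner[y][x] is not None:
                        frame[ty + y][tx + x] = winner[y][x]
                        texture_fetches += 1
    return frame, texture_fetches

# Hypothetical scene: a wall, a crate in front of it, a player in front of both.
scene = [
    (0, 0, 8, 4, 3.0, "wall"),
    (2, 0, 6, 4, 2.0, "crate"),
    (3, 1, 5, 3, 1.0, "player"),
]
frame, fetches = render_tiled(scene)
print(fetches)  # one texture lookup per visible pixel
```

For this scene, the tiled renderer performs exactly one texture lookup per visible pixel (32 in total), because hidden pixels are discarded before any texture is touched.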
One of the features that many said was missing from 3dfx's Voodoo3 chip was the ability to render in 32-bit color. We can recall that back then, in the days when Quake II and Half-Life boasted top-of-the-line graphics, not many games utilized 32-bit rendering. Seeing this, 3dfx neglected to include support for 32-bit rendering, which basically made the Voodoo3 much less future-proof than its competing product, the TNT2 from NVIDIA. This fact, I think, contributed considerably to the downfall of 3dfx and the rise of NVIDIA.
Since then, we haven't seen a single new product without 32-bit rendering support. NVIDIA had that support dated all the way back to their original TNT product, ATI introduced 32-bit rendering in their Rage128 and, later, improved on it on their Radeon and even 3dfx has come onto the scene, albeit too little too late, with their Voodoo4/5 products.
Unfortunately, all of the above-mentioned products still show a considerable performance loss when rendering at 32-bit, simply because they are using the conventional method of rendering. The reason for such a considerable loss in performance lies in the z-buffer, once again. To refresh our short-term memory: with conventional rendering, when the scene finally hits the stage where it utilizes the z-buffer, the polygons have already been textured at the specified color depth. Because of this, a polygon that has 32-bit textures will require the z-buffer to use up twice as much memory bandwidth as a polygon that has 16-bit textures applied to it.
With tile-based rendering, we must remember that the z-buffer is on-chip and therefore does not use the slower video memory. Because the z-buffer is on-chip, everything can be done in 32-bit mode without doubling the memory bandwidth requirements, as happens with traditional accelerators. In fact, when using the KYRO, even 16-bit scenes are rendered in 32-bit and then dithered down to 16-bit when needed. This is what Imagination Technologies and STMicro call Internal True Color, and it allows for superior 16-bit image quality over conventional 3D accelerators.
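To illustrate what "dithered down to 16-bit" can mean in practice, here is a small Python sketch that reduces 8-bit-per-channel color to a 16-bit RGB565 value using ordered dithering. The 2x2 threshold matrix, the RGB565 layout and the function names are all assumptions chosen for illustration; the review does not say which dithering method the KYRO actually uses.

```python
# Ordered-dither thresholds for a 2x2 pixel block (values 0..3).
# Assumption: a simple Bayer-style matrix, not the KYRO's real method.
BAYER_2X2 = [[0, 2], [3, 1]]

def dither_channel(value, bits, x, y):
    """Reduce an 8-bit channel value to `bits` bits with ordered dithering."""
    step = 1 << (8 - bits)  # size of one quantization step
    threshold = (BAYER_2X2[y % 2][x % 2] + 0.5) / 4 * step
    level = int((value + threshold) // step)
    return min(level, (1 << bits) - 1)  # clamp to the largest valid level

def to_rgb565(r, g, b, x, y):
    """Pack a dithered pixel as 5 bits red, 6 bits green, 5 bits blue."""
    r5 = dither_channel(r, 5, x, y)
    g6 = dither_channel(g, 6, x, y)
    b5 = dither_channel(b, 5, x, y)
    return (r5 << 11) | (g6 << 5) | b5

# An 8-bit value of 100 sits halfway between 5-bit levels 12 and 13.
levels = [dither_channel(100, 5, x, y) for y in range(2) for x in range(2)]
print(levels)
```

Across the 2x2 block, the value 100 alternates between levels 12 and 13, so the block averages out to the in-between shade that a plain truncation to 5 bits would lose. That is the kind of quality gain that rendering internally at higher precision makes possible.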
There are many different types of textures that can be applied to almost any polygon: from regular image textures to light textures to bump-mapped textures, each providing an added sense of realism to 3D objects. Textures are applied as "layers" to the objects that they are bound to, so if a designer wishes to apply a light-map texture on top of the base image texture, that polygon requires two layers of textures. Observe:

This is what we call multi-texturing, and the KYRO is the mother of all multi-texturing graphics accelerators. Take NVIDIA's GeForce2 GTS, for example: this powerful chip has two layers of multi-texture support. This means that the GeForce2 GTS is capable of rendering a maximum of 2 texels, or two texture layers, per pixel in a single pass. If the number of texels exceeds 2, then the GeForce2 GTS has to use multi-pass to render the scene, meaning that it has to go back and render the layers that it can't fit into one pass, which reduces both performance and image quality. ATI's Radeon graphics accelerator has support for 3-layer multi-texturing. So, following the same reasoning as with the GeForce2 GTS, we can conclude that the Radeon is capable of rendering a maximum of 3 texels per pixel.
Now, the Radeon seems to have an advantage over the GeForce2 GTS here because of its 3-layer multi-texturing support. This is one of the main reasons why the GeForce2 GTS fails to render ATI's demo scene for the Radeon: the demo uses an abundance of 3-layer textures.
Now, 3 layers may sound impressive when compared to the GeForce2 GTS's 2-layer support, but wait until you hear this: the KYRO has a whopping 8 layers of multi-texture support. I don't know of any games that utilize 8-layer multi-texturing, but I'm sure that once games start using it, they will look more realistic than ever and will cost the KYRO little to no performance.
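The multi-pass arithmetic above is easy to work out. The Python sketch below uses the per-pass layer counts quoted in this review and simply divides and rounds up; the function name is hypothetical, and real drivers may split passes differently (or, as with ATI's demo, refuse to render at all).

```python
import math

# Texture layers per pass, as quoted in the review.
LAYERS_PER_PASS = {"GeForce2 GTS": 2, "Radeon": 3, "KYRO": 8}

def passes_needed(texture_layers, layers_per_pass):
    """Rendering passes needed to apply all layers (simple ceiling division)."""
    return math.ceil(texture_layers / layers_per_pass)

for layers in (3, 8):
    for chip, per_pass in LAYERS_PER_PASS.items():
        print(layers, "layers on", chip, "->", passes_needed(layers, per_pass), "pass(es)")
```

By this estimate, a 3-layer surface takes two passes on the GeForce2 GTS but one on the Radeon and the KYRO, while a hypothetical 8-layer surface takes four passes on the GeForce2 GTS, three on the Radeon and still only one on the KYRO.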
First out of the gates with the KYRO processor is PowerColor's Evil KYRO. At $139, PowerColor has priced this card to compete directly with products based on NVIDIA's mid-end GeForce2 MX GPU.
Specifications
We must note that the Evil KYRO has only a 270MHz RAMDAC. Most of the popular cards we've seen lately have featured RAMDACs of well above 300MHz, but the Evil KYRO's comparatively small 270MHz RAMDAC was still able to produce a crisp and clear picture at resolutions as high as 1600x1200 @ 60Hz (as high as my monitor would go).
We received the 64MB version of the Evil KYRO for this review. The card came in a box (gasp!) and, in the box, we get the usual PowerColor bundle: a fair manual that not only details how to install the card and the included software but also explains some of the features of the card, and a CD with drivers for Windows 98/ME/NT4.0/2000 and versions of both WinDVD (for playing DVDs) and VCD PowerPlayer SE (for playing VCDs). Also included is the full version of Test Drive 5, a popular game that's getting quite old by now.

The card itself seems very "quiet" when compared to the other cards in our test lab. While other cards are very busy, with chips, capacitors and the like all over their PCBs, the Evil KYRO has a very clean design. A small heatsink/fan combo is mounted onto the KYRO chip itself, and a generous amount of thermal paste provides sufficient heat transfer. Prying off the heatsink gives us an up-close and personal look at the KYRO chip itself.

I did find one thing quite peculiar about this card: there are two sets of jumpers on the board for setting the card to either AGP 2X or 4X mode. Although it defaults to 4X, most other cards nowadays don't have these jumpers and depend on the motherboard to select the AGP mode. This could be useful for troubleshooting.
Eight 8MB M.tec 7ns SDRAM chips surround the KYRO chip, operating at 125MHz and providing the Evil KYRO with 64MB of video memory.
Installation & Drivers
Installation was pretty much uneventful as all that was required was to unplug the old AGP card and plug in the new Evil KYRO. As Windows boots up, it will detect the card and ask you to insert the CD whereupon you can choose the correct driver for the card. You can also choose to install the included PowerColor toolbox that gives you information about your card (such as driver versions) and monitor and allows you to change settings such as resolution and refresh rate. Unfortunately, the included toolbox does not include a tab with which to set the core/memory clock speed for overclocking purposes.
The included drivers were reference KYRO drivers provided by STMicro. Because of how much I've worked with NVIDIA's drivers, it took some time to get used to the new interface but after I got used to using it, I was able to access many of the card's advanced features.
Below are some screen shots of the driver settings themselves and of PowerColor's toolbox:


This review of the PowerColor Evil KYRO marks the first time ever that 2 test systems have been used to benchmark a product here at OnePC. In the past, our pool of hardware was quite limited, allowing us to assemble only one test system. Now that OnePC has grown considerably, our hardware pool is considerably larger, allowing us to use multiple test systems in our reviews.
| Test System #1 - High-End | |
| CPU | Intel Pentium III 1GHz |
| Motherboard | ASUS CUSL2 (i815E chipset) |
| Memory | 1 x 128MB Micron OEM PC133 SDRAM @ CAS3 |
| Hard Drive | Quantum Fireball Plus LM 30GB |
| Network | Realtek 10BaseT Network Interface Card |
| Test System #2 - Low-End | |
| CPU | AMD Duron 650MHz |
| Motherboard | SOYO SY-K7VTA (KT133 chipset) |
| Memory | 1 x 128MB Micron OEM PC133 SDRAM @ CAS3 |
| Hard Drive | Quantum Fireball Plus LM 30GB |
| Network | Realtek 10BaseT Network Interface Card |
| Configuration | |
| Video | PowerColor Evil KYRO; PowerColor PowerGene GeForce2 MX; PowerColor PowerGene GeForce2 GTS |
| Operating System | Microsoft Windows ME (4.90.3000) |
| Special Drivers | VIA 4-in-1 Pack (4.24) for VIA chipsets; NVIDIA Detonator 3 Reference drivers (6.31) for NVIDIA cards; KYRO Reference drivers (4.12.01.1544-1.00.02.0163) for the Evil KYRO |
| Software | 3DMark2000 1.1 (build 340); Quake 3: Arena (Point-Release 1.17); Unreal Tournament (v.4.32); VillageMark 1.1 |
Driver Issues
Like all new products, the KYRO had some pretty serious driver compatibility problems. Usually, compatibility issues only arise on non-Intel chipset and processor solutions, such as the AMD and VIA combo we've now gotten so used to seeing, but the KYRO actually had problems with our High-End test system, which was, ironically, based around Intel's 815E chipset and an Intel Pentium III 1GHz processor, while it ran flawlessly on our AMD and VIA combo.
The problem was that, quite frequently, 3D applications (3DMark2000 and VillageMark) and games (Quake 3: Arena and Unreal Tournament) would lock up after being run once or twice. Windows did not lock up and CTRL + ALT + DELETE worked, but the 3D application just failed to continue running. I did not find a way around this, so when benchmarking the Evil KYRO on the High-End test system I had to live with constant crashes.
I even took the liberty of asking the technical support staff at PowerColor about this problem, but they informed me that the drivers on the included driver CD are the newest ones around.
As usual, I'm going to start out with our ol' faithful 3DMark2000. I've said time and time again that this benchmark not only puts considerable stress on your video subsystem, but also on your system platform.

Both the NVIDIA cards are running with hardware T&L enabled in this benchmark while the Evil KYRO, which doesn't support hardware T&L, is running with software T&L. This is not a fair comparison between each card, but can show you how each card performs under "best-case" scenarios.
As you can see, even without hardware T&L, the Evil KYRO can keep up quite nicely with the NVIDIA duo, even topping the GeForce2 MX at 32-bit a few times in the higher resolutions and then falling back to last place when in 16-bit rendering mode. This is just one of the many examples of the KYRO's superior 32-bit rendering that gives it such a boost in performance when compared to the NVIDIA cards at 32-bit color.

In our low-end system, the Duron 650 was basically a bottleneck for all three cards, as the performance level plateaued at the lower resolutions. At the higher resolutions, the video cards' fill rates become the bottleneck, and so we see the same pattern of the KYRO taking a lead over the GeForce2 MX at 32-bit color.
The GeForce2 GTS continues to dominate in both test systems.
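The plateau-then-drop shape of these results can be captured with a very simple model: the frame rate is capped either by the CPU or by the card's fill rate, whichever is lower. The Python sketch below is only an illustration; the CPU cap, the fill rate and the depth complexity are made-up numbers, not measurements from this review.

```python
def estimated_fps(cpu_limit_fps, fill_rate_pixels_per_s, width, height,
                  depth_complexity=2.5):
    """Frame rate capped by the slower of the CPU and the card's fill rate."""
    pixels_drawn_per_frame = width * height * depth_complexity
    fill_limit_fps = fill_rate_pixels_per_s / pixels_drawn_per_frame
    return min(cpu_limit_fps, fill_limit_fps)

# Hypothetical figures: a CPU that tops out at 60 FPS, a 200 Mpixel/s card.
for width, height in [(640, 480), (1024, 768), (1600, 1200)]:
    print(width, height, round(estimated_fps(60, 200e6, width, height), 1))
```

With these made-up figures, the CPU holds the result at 60 FPS at 640x480 and 1024x768, and only at 1600x1200 does the fill rate take over and pull the frame rate down to roughly 42 FPS.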

Now, things start getting interesting. To level the playing field, I decided to turn off hardware T&L support altogether, forcing all three cards to run using software T&L. With this setting, we see the GeForce2 MX become the slowest of the bunch and the KYRO and GeForce2 GTS exchange leads. At the lower resolutions, the KYRO leads because of its more advanced technology, but at the higher resolutions, it falls short simply because of the GeForce2 GTS's raw power.

Again, we see that the Duron 650 processor in our low-end system becomes the bottleneck. As soon as we hit the higher resolutions, where the platform becomes less and less of a bottleneck and the video card carries more and more of the weight, we see that the GeForce2 GTS just slightly edges out the KYRO for top spot. The GeForce2 MX is simply hurting, dragging along in last place through all the resolutions.
Through the last 2 graphs, we begin to see the benefits of the Evil KYRO appear. With hardware T&L, the two NVIDIA cards clearly win over the KYRO, but without hardware T&L, they fall to the wrath of the KYRO's advanced rendering methods.
To help us see the difference in how the KYRO scales in resolution and color depth when compared to the NVIDIA cards, I decided to move away from the usual bar graphs and compiled the following line graph. The numbers in the following graph are the same as the ones found in the bar graphs for 3DMark2000, but the line graph shows us a certain pattern that can be hard to see with the bar graphs...

... and a pattern we do see! Notice how both the GeForce2 MX and GTS dip down when hitting 32-bit rendering and then slide back up when moving on to 16-bit rendering, even though it's a resolution "step" above the former 32-bit mark. Now notice how the KYRO makes a gradual dive as it goes through the different resolutions. In fact, you can even see that the slopes of the lines going from 16-bit to 32-bit are ever so slightly smaller than those of the lines going from 32-bit to 16-bit. This is an excellent example that clearly shows Internal True Color rendering at work, a benefit of the KYRO's tile-based rendering.
Quake 3: Arena is my all-time favourite for benchmarking video subsystems. Unlike its competitor, the Unreal engine, the Quake 3 engine stresses the raw power of video cards to their max. It also manages to stress the system platform just enough so that it also makes a big difference in the frame rate.
First up is the good-ol' "demo001" that comes with Quake 3: Arena running on our high-end system:

Here, we see the Evil KYRO take on the GeForce2 MX with some convincing force. The two cards trade second place, behind the GeForce2 GTS, with the GeForce2 MX taking the lead in 16-bit rendering and the KYRO in 32-bit rendering. Again, these are signs of the impressive 32-bit performance that tile-based rendering can bring to the KYRO.

We again see the same pattern on our low-end system as we did in our high-end system with the GeForce2 MX and the Evil KYRO exchanging places for second place. We also see again that the Duron 650 processor becomes a bottleneck for all three cards at the lower resolutions.
For this review, I decided to try out a benchmark that stresses video performance more than the good-ol' "demo001." Quaver is a popular benchmark based on the Q3DM9 map, which is notorious for its use of enormous textures. It's also an indoor map that feeds the video card slightly more overdraw than the map found in "demo001." Both of these features of Quaver should give the Evil KYRO a slight advantage over the NVIDIA siblings because the KYRO's tile-based rendering both reduces the memory bandwidth used (e.g. by large textures) and eliminates virtually all overdraw.

Well, unfortunately, the change to enormous textures and slightly more overdraw didn't do much to the pattern that we witnessed earlier with demo001. The GeForce2 MX and the Evil KYRO continue to exchange leads, but with Quaver, it looks like the GeForce2 MX's leads have been cut ever so slightly and the KYRO's leads have increased ever so slightly. This can probably be attributed to the above-mentioned differences between "Quaver" and "demo001."

Nothing new to see here...
In the past, I have always reserved Unreal Tournament as a benchmark for motherboard or processor reviews because of how much it stresses the system platform rather than the video system. However, since the KYRO is so drastically different from previous video cards that I've tested, I had to include UT in the benchmarking bunch to see how it matches up against the others.
Of course, to help me get better results from Unreal Tournament, I had to use a different benchmarking demo than I had used before. The demo that I had been using in the past was called "wicked400" and it was designed to test the system platform performance rather than video performance. To help me better test video performance, I chose to use the demo "Thunder," which claims to scale much better than "wicked400" as the resolutions are cranked up.

The results above show that the KYRO is not up to par with the GeForce2 MX and GTS cards here. Both NVIDIA cards obtained higher frame rates at the lower resolutions, even in 32-bit color, where the KYRO got oh-so close to the GeForce2 MX score but just didn't have the oomph to pull ahead. As we move up into the higher resolutions, we see that the GeForce2 MX finally runs out of steam, giving the Evil KYRO a chance to overtake it at resolutions of 1024x768 and above @ 32-bit color.

Here's where it once again becomes very interesting. The first thing that we notice is that the results remain steady in the lower resolutions because of how processor/memory-intensive the Unreal engine is. Secondly, it seems that, contrary to what we witnessed with our high-end system, the results at the lower resolutions ended up in favour of the Evil KYRO for the very first time. This leads me to believe that either NVIDIA's Detonator drivers are not as well optimized for AMD platforms as they are for Intel platforms, or the KYRO's reference drivers are better tuned for AMD platforms. Seeing that the drivers had difficulty running the card stably on our Intel platform, which is very weird in itself, I will have to side with the latter explanation, that the KYRO's drivers are better tuned for AMD platforms. That is something rarely seen, as it is often the other way around.
It's also very interesting when looking at the higher resolutions. At 1024x768 @ 32-bit, the GeForce2 GTS actually moves ahead of the 32-bit-power-rendering-house Evil KYRO, but then seems to die off again at the next 32-bit resolution "up," 1280x1024 @ 32-bit. This example shows how the tile-based rendering method of the KYRO is finally paying off as its sleek elegance out-performs the GeForce2 GTS's raw power.
Much like how NVIDIA has their own set of benchmarks and demos to show off the features and speed of their products, STMicro also has a set of their own benchmarks and demos, one of which is VillageMark. Because of how the rendering pipeline is organized in tile-based rendering, the more overdraw a scene has, the faster the KYRO will render when compared to conventional 3D accelerators. VillageMark, as you might expect, has loads of overdraw because of the hundreds of overlapping buildings in the scene, and thus we should expect a huge boost in performance when comparing the KYRO to the two GeForce2s.
Below are screenshots of what VillageMark looks like:

Keep in mind that the only color-depth supported by VillageMark is 16-bit, so all tests were conducted at 16-bit color-depth.

As you can see right off the bat, VillageMark is a very tough benchmark, even for the KYRO, which is what the benchmark was built around. At the lower resolutions, we see the KYRO stabilize at 76 FPS. From this, you would think that the results were limited by the platform, but they are not, as you can see from the following graph of the benchmark running on our low-end system:

No, what you're seeing is not a mistaken duplicate of the graph from the high-end system above: the results that I obtained from both systems really were 100% identical! This leads me to believe that this benchmark is almost 100% video subsystem-intensive and does not stress the system platform at all!
We see how both NVIDIA cards suffer from the lack of tile-based rendering. Even at a resolution as low as 800x600, the powerful and usually dominating GeForce2 GTS cannot even manage a playable 30 FPS. The GeForce2 MX is even worse, as it starts off at 640x480 at only 38 FPS.
At $139, the Evil KYRO is priced right between NVIDIA's GeForce2 MX and GeForce2 GTS products. Likewise, our benchmark results showed that the Evil KYRO performs, on average, slightly better than PowerColor's own GeForce2 MX card; that is, depending on what color depth you normally play in. If you're running your games at 16-bit color all the time and don't think 32-bit will ever make a difference for you, then the Evil KYRO is not your card and I would suggest sticking with a "traditional" card such as the GeForce2 MX, or maybe even the GTS if you have the cash. If, however, you're like me and always run games at 32-bit whenever possible, then you're sure to see a performance benefit from the Evil KYRO over "traditional" cards.
The KYRO's tile-based rendering engine is truly a work of art that, like all hardware technologies, must be embraced by game developers for one to truly see its advantages. As we saw from our benchmarks, current games such as Quake 3: Arena and Unreal Tournament take little advantage of the KYRO's tile-based rendering technique and, as such, don't allow it to perform up to its full potential. On the other hand, if we someday find a game that has the same amount of overdraw as we saw in the VillageMark benchmark, then, according to the results obtained from running VillageMark, the KYRO will reign supreme.
Will games in the future embrace the advantages of tile-based rendering? It will take at least a year or two, if we consider the amount of time that NVIDIA's hardware T&L has taken to catch on, until we actually see games popping up on store shelves that combine killer overdraw, multiple texture layers and huge textures. Until those games begin to surface, the Evil KYRO's only advantage is its blazing-fast 32-bit rendering, a feature that I feel is very much needed; but once they do, we will finally see that elegance will outperform power.
Pros
Copyright © 2000-2001, OnePC Network Inc. All rights reserved.
OnePC.NET is found online at http://www.onepc.net