Your view on AMD's Bulldozer

per:

--- Quote from: kitamesume on December 26, 2011, 07:34:40 AM ---^ to begin with, most people wouldn't need a rig much faster than an i3-2100 at $125 or an i5-2300 at $185...
and these can run on H61 boards that cost around $60...

or if you'd like a muuuuuuch cheaper quad then how about an Athlon II X4 for around $100?
Edit: or a Phenom II X6 1055T for around $150?

PS: there's only a handful of programs that'll benefit from all 8 cores of that thing.

--- End quote ---

The performance of Bulldozer is not really the worst part; it is the performance/price and performance/watt ratios that are both a bit too low. An i5 2500K easily beats a Bulldozer on both counts, and is faster in almost all applications as well.

The Phenom line is being discontinued (or, well, the name is being reused to brand new Bulldozer versions).
I guess AMD could not have the old Phenoms beating Bulldozer in both performance and price in most applications.

And as for software, video recoding and compilation are two things that can both use almost any number of cores.

Unfortunately the first suffers from Bulldozer's floating-point performance, and the second from its cache/memory bandwidth. So both are actually faster on a four-core Intel CPU. And Bulldozer does not have eight full cores.

--- Quote ---It's an interesting idea with potential to have the integer calculations steered to one core and the floating point operations steered to the other core, but it will also need the application to be compiled with that approach in mind.  Until that happens, benchmarks and applications will perform better on Intel than on Bulldozer.

--- End quote ---

That is not how it works. There are eight integer/branch units, and four floating point units.

Every two integer cores share one FPU. There is no way in the code to "steer" FPU instructions to a specific FPU; they always go to the one connected to the core your code is running on. From the point of view of the code it is as if each integer core has its own FPU. It is only that the FPU is effectively running at half speed when both cores use it.

If you mean that you run all floating-point math in one thread and all integer math in another, and then assign them to the two cores of one module, this means that one core will run at full speed (the one doing the integer math) while the other will mostly be waiting for the FPU (the one doing only FPU work).

You get the same effect with hyperthreading, where unused parts of the core can run instructions from another thread while it is waiting for FPU or memory fetches, thus giving a higher utilization of the core.

And if you do rewrite your applications to split out all FPU calculations to a separate thread (even if that is possible without nightmarish thread inter-dependencies) you will only get better performance if you do half as much FPU as ALU calculations.

That is probably only true for a small set of applications.

And in all likelihood, unless the instruction scheduler in the Bulldozer module is really bad, you will actually get worse performance that way. If you do not split the threads, both cores will do some integer math and some FPU work, and will probably spend less total time waiting: when core 1 is doing FPU work, core 2 might be doing integer math (or, well, it will be doing that or it will be waiting).
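
To put rough numbers on that argument, here is a toy model in Python. It is only a sketch under my own assumptions, not how the real scheduler works: each core issues at most one operation per cycle, integer operations never stall, and the shared FPU accepts one operation per cycle in total. The function name run_module and the workload sizes are made up for illustration.

```python
def run_module(thread0, thread1):
    """Count cycles for two cores that share one FPU to finish their work.

    Each thread is a list of "int" or "fp" operations, executed in order.
    Toy assumptions: one op per core per cycle, one FPU op per cycle total.
    """
    threads = [list(thread0), list(thread1)]
    pos = [0, 0]      # index of the next operation for each core
    priority = 0      # core that gets first pick of the FPU this cycle
    cycles = 0
    while pos[0] < len(threads[0]) or pos[1] < len(threads[1]):
        cycles += 1
        fpu_busy = False
        for core in (priority, 1 - priority):
            if pos[core] >= len(threads[core]):
                continue  # this core has already finished
            op = threads[core][pos[core]]
            if op == "int":
                pos[core] += 1        # integer unit is private, no stall
            elif not fpu_busy:
                fpu_busy = True       # claim the shared FPU for this cycle
                pos[core] += 1
            # otherwise the core stalls this cycle, waiting for the FPU
        priority = 1 - priority       # alternate who wins FPU conflicts
    return cycles


# Same total work in both cases: 300 integer ops and 100 FPU ops.
split = run_module(["int"] * 300, ["fp"] * 100)   # one thread per op type
pattern = (["int"] * 3 + ["fp"]) * 50             # 150 int + 50 fp per thread
mixed = run_module(pattern, pattern)
print("split threads:", split, "cycles")
print("mixed threads:", mixed, "cycles")
```

In this model the split layout takes 300 cycles, because the FPU-only thread finishes early and its core then sits idle, while the mixed layout finishes in just over 200. When the FPU work is as large as the integer work, the two layouts come out about the same. Real hardware is far more complicated, so treat this as an illustration of the reasoning only.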

This low FPU performance is an issue in desktop applications, since the most CPU-intensive ones tend to do a lot of FPU calculations (games, video compression, image processing, sound processing, etc.).

The exception being 7zip. :)

For a server this design might make sense, since servers tend to have a more integer-based workload.

However, this is not always the case. As an example, the 8-core AMD server CPUs performed worse than the 6-core Intel ones (low-power parts, running at ~1 GHz lower frequency) for our workload. In fact, hyperthreading gave more of a performance boost than AMD's two-cores-per-module threading.

And the Intel parts use one sixth of the power to do the work, and cost about the same.

kitamesume:
^ add in the fact that if the rig reaches kilowatts in power consumption (which is usually the case for stacked servers), being 25-75% more power efficient means 250-750 watts less power consumption per kilowatt.
Being power efficient also means fewer cooling problems, which means the stock coolers would work even if they're garbage. Less total cost, in short.

Edit: you can't really use Bulldozer's stock cooler when overclocking, right? Unlike Intel's i5, which could at least overclock to 4 GHz with the stock cooler.
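
For the 25-75% figure above, the savings depend on what "more power efficient" is taken to mean. The sketch below works out both readings for a 1000 W baseline, which is an assumed example value; the function names are made up for illustration.

```python
def savings_if_power_cut(baseline_w, fraction):
    """Watts saved if the rig simply draws `fraction` less power."""
    return baseline_w * fraction


def savings_if_perf_per_watt_gain(baseline_w, gain):
    """Watts saved for the same work if performance per watt rises by `gain`."""
    return baseline_w - baseline_w / (1.0 + gain)


baseline = 1000.0  # watts, assumed example rig
for pct in (0.25, 0.75):
    cut = savings_if_power_cut(baseline, pct)
    ppw = savings_if_perf_per_watt_gain(baseline, pct)
    print(f"{pct:.0%}: {cut:.0f} W saved as a straight cut, "
          f"{ppw:.0f} W saved as a perf/watt gain")
```

Read as a straight cut in power draw, 25-75% of 1000 W is the 250-750 W from the post. Read as 25-75% more work per watt, the same workload saves roughly 200-430 W instead.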
