The Bulldozer CPU is "special" because of its module design. There is no Hyper-Threading (HTT); instead, each module holds 2 cores, each with its own integer "cluster", and the two cores share a single floating-point unit.
https://de.wikipedia.org/wiki/AMD_FX#/media/File:AMD_Bulldozer_block_diagram_(8_core_CPU).PNG
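A 4-thread FPU workload pinned to cores 0, 2, 4, 6 (one core per module, assuming the OS numbers cores 0/1, 2/3, 4/5, 6/7 as module pairs) could perform better, since each thread would then have a module's FPU to itself...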
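
To try the pinning idea, something like the following rough sketch might do. It assumes Linux (where `os.sched_setaffinity` is available) and assumes that logical cores 2k and 2k+1 sit in the same module; check the real topology with a tool such as `lscpu` first. The helper names `one_core_per_module` and `fpu_work` are made up for this example, and the loop is only a stand-in for a real floating-point workload.

```python
import os
from multiprocessing import Process


def one_core_per_module(n_cores=8, cores_per_module=2):
    """Return the first logical core of each module, e.g. [0, 2, 4, 6].

    Assumption: the OS numbers the cores of a module consecutively.
    """
    return list(range(0, n_cores, cores_per_module))


def fpu_work(core, iterations=1_000_000):
    """Pin the current process to `core` (if possible) and do float math."""
    # sched_setaffinity only exists on Linux; skip pinning elsewhere or
    # when the requested core is not available to this process.
    if hasattr(os, "sched_setaffinity") and core in os.sched_getaffinity(0):
        os.sched_setaffinity(0, {core})
    x = 0.0
    for i in range(iterations):
        x += (i * 0.5) ** 0.5  # placeholder floating-point work
    return x


if __name__ == "__main__":
    # One worker per module, so no two workers share an FPU.
    workers = [Process(target=fpu_work, args=(c,)) for c in one_core_per_module()]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
```

Timing this against a run pinned to cores 0, 1, 2, 3 (two threads per module) would show whether the shared FPU actually matters for a given workload.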