Adding static data dependence collapsing to a high-performance instruction scheduler
State-of-the-art processors achieve high performance by executing multiple instructions in parallel. However, the parallel execution of instructions is ultimately limited by true data dependencies between individual instructions. The objective of this paper is to present and quantify the benefits of static data dependence collapsing, a non-speculative technique for reducing the impact of true data dependencies on program execution time. Data dependence collapsing combines a pair of instructions when the second instruction is directly dependent on the first. The two instructions are then treated as a single entity and are executed together in a single functional unit that is optimised to handle functions with three input operands instead of the traditional two. Dependence collapsing can be accomplished either dynamically at run time or statically at compile time. Since dynamic dependence collapsing has been studied extensively elsewhere, this paper concentrates on static dependence collapsing. To quantify its benefits, we added a new dependence collapsing option to the Hatfield Superscalar Scheduler (HSS), a state-of-the-art instruction scheduler that targets the Hatfield Superscalar Architecture (HSA). We demonstrate that the addition of dependence collapsing to HSS delivers a significant performance increase of up to 15%. Furthermore, since HSA already executes over four instructions in each processor cycle without dependence collapsing, dependence collapsing enables 0.4 additional instructions to be executed in each processor cycle.
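To make the pairing rule concrete, the following is a minimal Python sketch of collapsing a directly dependent pair into one three-input operation. It is an illustration under stated assumptions, not the algorithm used in HSS: the names `Instr`, `collapse_adjacent` and `live_out` are hypothetical, only adjacent pairs are considered, and a pair is fused only when the first result has no other reader, so that the producer can safely disappear.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Tuple


@dataclass(frozen=True)
class Instr:
    """Hypothetical register-to-register instruction (not an HSA encoding)."""
    dest: str
    op: str
    srcs: Tuple[str, ...]  # two sources normally, three once collapsed


def _dead_after(block: List[Instr], start: int, reg: str,
                live_out: FrozenSet[str]) -> bool:
    """True if the value in `reg` is never read from block[start] onwards."""
    for ins in block[start:]:
        if reg in ins.srcs:
            return False          # a later instruction still reads the value
        if ins.dest == reg:
            return True           # overwritten before any further read
    return reg not in live_out    # assumption: live_out lists values needed after the block


def collapse_adjacent(block: List[Instr],
                      live_out: FrozenSet[str] = frozenset()) -> List[Instr]:
    """Greedily fuse adjacent producer/consumer pairs into three-input ops."""
    out: List[Instr] = []
    i = 0
    while i < len(block):
        first = block[i]
        if i + 1 < len(block):
            second = block[i + 1]
            fusable = (
                len(first.srcs) == 2 and len(second.srcs) == 2
                # exactly one operand of the consumer is the producer's result
                and second.srcs.count(first.dest) == 1
                # the intermediate value must have no other use
                and (second.dest == first.dest
                     or _dead_after(block, i + 2, first.dest, live_out))
            )
            if fusable:
                pos = second.srcs.index(first.dest)
                other = second.srcs[1 - pos]
                # Record which operand slot the forwarded value fills, since
                # this matters for non-commutative operations such as sub.
                out.append(Instr(second.dest,
                                 f"{first.op}>{second.op}@{pos}",
                                 first.srcs + (other,)))
                i += 2
                continue
        out.append(first)
        i += 1
    return out


block = [
    Instr("r1", "add", ("r2", "r3")),   # r1 = r2 + r3
    Instr("r4", "sub", ("r1", "r5")),   # r4 = r1 - r5, directly dependent on r1
    Instr("r6", "mul", ("r4", "r4")),   # r6 = r4 * r4
]
collapsed = collapse_adjacent(block, live_out=frozenset({"r6"}))
```

In this example the `add` and `sub` become a single entity computing `r4 = (r2 + r3) - r5` from three inputs, which shortens the dependence chain from three operations to two. The `mul` is left alone because both of its operands come from the same producer, so fusing it would need a different functional form. A real scheduler would also consider non-adjacent pairs and functional-unit constraints, which this sketch deliberately ignores.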