From: jac1 (jac1 at domain student.cs.ucc.ie)
Date: Thu 16 Aug 2001 - 16:24:12 IST
>Branch prediction performance is dependent on CPU hardware and choice of
>algorithm, but can be more efficient than delay slot instruction execution.
>> is it something to do with groupings, eg having to make sure by hand
>> that dependent instructions are far enough 'away' from the previous
>> instruction to not be executed at the same time, ie every set of,
>> say, 2 ops are executed in parallel???
>no, but funnily enough alpha cpus do something like this with 'packets' of
>two or four instructions that get issued simultaneously to parallel pipelines.
>another story :)
Pentiums are dual-pipelined, aren't they?
To make matters worse, some instructions (i387) can only be executed on a
certain pipeline (the U one IIRC)! Fairly headwrecking stuff if you're to do
it by hand, which is the main reason hardly anything, bar hardware stuff, is
done in asm.
Compilers do a good enough job of it.
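
To give a feel for why hand-scheduling gets headwrecking, here is a rough toy
model of a dual-issue, in-order machine. It is only a sketch: the pairing rules
below are heavily simplified and are not the real Pentium rules, and every
name in it (Instr, first_pipe_only, can_pair, count_cycles) is made up for
illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    name: str
    dest: str                      # register written
    srcs: tuple = ()               # registers read
    first_pipe_only: bool = False  # hypothetical flag: cannot go in the second pipe


def can_pair(first, second):
    """Toy rule: may `second` issue in the same cycle as `first`?"""
    if second.first_pipe_only:
        return False               # restricted instruction needs the first pipe
    if first.dest in second.srcs:
        return False               # second reads what first writes
    if first.dest == second.dest:
        return False               # both write the same register
    return True


def count_cycles(program):
    """Issue in order, two per cycle when the toy pairing rule allows it."""
    cycles = 0
    i = 0
    while i < len(program):
        if i + 1 < len(program) and can_pair(program[i], program[i + 1]):
            i += 2
        else:
            i += 1
        cycles += 1
    return cycles


load_a = Instr("mov eax, [x]", dest="eax")
add_a = Instr("add eax, ebx", dest="eax", srcs=("eax", "ebx"))
load_c = Instr("mov ecx, [y]", dest="ecx")
add_c = Instr("add ecx, edx", dest="ecx", srcs=("ecx", "edx"))

naive = [load_a, add_a, load_c, add_c]      # dependent ops back to back
reordered = [load_a, load_c, add_a, add_c]  # dependent ops moved apart

print(count_cycles(naive))      # 3
print(count_cycles(reordered))  # 2
```

Same four instructions, same result, but the reordered version issues in two
cycles instead of three under this model. Keeping track of that by hand across
a whole routine, with the pipeline restrictions on top, is exactly the sort of
bookkeeping a compiler's scheduler is for.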