[6091] in Athena Bugs
RT libc.a ldiv$$()
daemon@ATHENA.MIT.EDU (John Carr)
Sat Sep 22 09:21:40 1990
To: bugs@ATHENA.MIT.EDU
Date: Sat, 22 Sep 90 09:21:27 EDT
From: John Carr <jfc@ATHENA.MIT.EDU>
ldiv$$() is the libc routine used for integer division. It was written for
the old model RT, which took 3 CPU cycles to generate each bit of the
32-bit result, so it counts the significant bits of the dividend and
branches into the unrolled divide sequence to skip the steps it does not
need. That setup code no longer helps, because the new model RTs take only
1 CPU cycle to generate each bit. The following change removes it, always
runs all 32 divide steps, and speeds up division by about 20%.
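
To illustrate the tradeoff, here is a rough C model of the two strategies.
It is only a sketch: it is unsigned, it does not reproduce the RT divide
step instruction or the %mq register, and the names divide_steps,
divide_fixed and divide_skipping are made up for this example. What it
shows is that skipping steps costs extra work up front (counting leading
zeros and shifting the dividend), which only pays off when each divide
step is expensive.

```c
#include <stdint.h>

/* Result of a modelled division; `steps` counts the divide steps executed. */
typedef struct {
    uint32_t quot;
    uint32_t rem;
    unsigned steps;
} divresult;

/* Run `steps` shift-and-subtract steps, producing one quotient bit per step.
 * `dividend` must already be left-justified for the given step count. */
static divresult divide_steps(uint32_t dividend, uint32_t divisor, unsigned steps)
{
    uint64_t rem = 0;   /* 64 bits so that (rem << 1) | bit cannot overflow */
    uint32_t quot = 0;
    for (unsigned i = 0; i < steps; i++) {
        rem = (rem << 1) | (dividend >> 31);   /* shift in next dividend bit */
        dividend <<= 1;
        quot <<= 1;
        if (rem >= divisor) {
            rem -= divisor;
            quot |= 1;
        }
    }
    divresult r = { quot, (uint32_t)rem, steps };
    return r;
}

/* Portable leading-zero count; returns 32 for x == 0. */
static unsigned leading_zeros(uint32_t x)
{
    unsigned n = 0;
    while (n < 32 && !(x & 0x80000000u)) {
        x <<= 1;
        n++;
    }
    return n;
}

/* Strategy after the patch: no setup, always 32 steps. */
static divresult divide_fixed(uint32_t dividend, uint32_t divisor)
{
    return divide_steps(dividend, divisor, 32);
}

/* Strategy before the patch (simplified): left-justify the dividend and
 * run only as many steps as it has significant bits. */
static divresult divide_skipping(uint32_t dividend, uint32_t divisor)
{
    unsigned lz = leading_zeros(dividend);
    if (lz == 32) {                 /* dividend is zero: nothing to do */
        divresult zero = { 0, 0, 0 };
        return zero;
    }
    return divide_steps(dividend << lz, divisor, 32 - lz);
}
```

For 100 / 7 both functions give quotient 14 and remainder 2, but
divide_skipping runs 7 steps where divide_fixed runs 32. Once a step costs
a single cycle, the setup can cost more than the steps it saves.
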
*** /source/bsd-4.3/rt/lib/libc/ca/gen/ldiv.s Tue Sep 19 14:20:19 1989
--- ldiv.s Sat Sep 22 09:03:59 1990
***************
*** 59,81 ****
ENTRY(ldiv$$)
cli r3,1 # special-case 0 and 1 divisors
jnh easy
! st r4,REG_OFFSET+16(r1)
! mr r4,r2 # determine no. of divide steps needed
! shra r4,16
! bmx 0f # all 32
! clz r4,r4 # upper half bitcount
! jnz 1f
! clz r4,r2 # lower half bitcount,
! ai r4,16 # augmented
! 1: dec r4,1 # dividing one extra position avoids
! # special-casing 0 and negative nos
! sl r2,r4 # left-justify dividend
! 0: mts %mq,r2
! get r0,$here
! a r0,r4
! a r0,r4
! brx r0
! shra r2,31 # extend dividend sign through hi reg
here: # 32 divide steps:
d r2,r3
d r2,r3
--- 59,65 ----
ENTRY(ldiv$$)
cli r3,1 # special-case 0 and 1 divisors
jnh easy
! mts %mq,r2
here: # 32 divide steps:
d r2,r3
d r2,r3
***************
*** 117,125 ****
mfs %mq,r0 # (retrieve the quotient)
a r2,r3 # correction: complete the final restore
! 1: sl r0,r4 # sign-extend the (32-r4)-bit quotient
! sra r0,r4
! cis r2,0 # adjust quotient up, remainder down ...
je 9f # if remainder's nonzero and ...
c r2,r3
je 7f # remainder == divisor,
--- 101,107 ----
mfs %mq,r0 # (retrieve the quotient)
a r2,r3 # correction: complete the final restore
! 1: cis r2,0 # adjust quotient up, remainder down ...
je 9f # if remainder's nonzero and ...
c r2,r3
je 7f # remainder == divisor,
***************
*** 130,138 ****
#return to the calling routine
9: mr r3,r2
- mr r2,r0
brx r15
! l r4,REG_OFFSET+16(r1)
# divisor is zero (cc low) or one (cc equal).
--- 112,119 ----
#return to the calling routine
9: mr r3,r2
brx r15
! mr r2,r0
# divisor is zero (cc low) or one (cc equal).