### **Mansoura Engineering Journal**

Volume 19 | Issue 3 Article 2

9-1-2021

# Performance Study of the Modified Fast Serial Parallel Multiplier with RECO Technique.

#### Ali El-Desoky

Professor, Computer & Control Engineering Department, Faculty of Engineering, El-Mansoura University, Mansoura, Egypt.

#### Aida Abd El-Gawad

Assistant Professor, Computer & Control Engineering Department, Faculty of Engineering, El-Mansoura University, Mansoura, Egypt.

#### Yasser El-Sheshnagy

VTMS Division, Transit Department, Suez Canal Authority, Ismailia, Egypt.

Follow this and additional works at: https://mej.researchcommons.org/home

#### **Recommended Citation**

El-Desoky, Ali; Abd El-Gawad, Aida; and El-Sheshnagy, Yasser (2021) "Performance Study of the Modified Fast Serial Parallel Multiplier with RECO Technique.," *Mansoura Engineering Journal*: Vol. 19: Iss. 3, Article 2.

Available at: https://doi.org/10.21608/bfemu.2021.164285

This Original Study is brought to you for free and open access by Mansoura Engineering Journal. It has been accepted for inclusion in Mansoura Engineering Journal by an authorized editor of Mansoura Engineering Journal. For more information, please contact mej@mans.edu.eg.

## PERFORMANCE STUDY OF THE MODIFIED FAST SERIAL PARALLEL MULTIPLIER WITH RECO TECHNIQUE

Performance of the Modified Fast Serial-Parallel Multiplier Treated by Recomputing with Circularly Shifted Operands

Prof. Dr. Ali Ibrahim El-Desoky, Dr. Aida Othman Abd El-Gawad
Computer & Control Dept., Faculty of Engineering, El-Mansoura University, El-Mansoura, Egypt.

Eng. Yasser Hassan El-Sheshnagy
VTMS division Transit Department, Suez Canal Authority, Ismailia, Egypt.

This paper studies the speed and hardware performance of the modified fast serial-parallel multiplier with concurrent error detection, to which the technique of recomputing with circularly shifted operands is applied. The modified fast serial-parallel multiplier was originally introduced to increase speed by using a fast parallel adder instead of the parallel adder of the fast serial-parallel multiplier, and it is suitable for high-speed VLSI applications when the chip area is limited. By applying recomputing with circularly shifted operands for concurrent error detection in the modified fast serial-parallel multiplier, a balance between suitable error coverage, small error latency, a suitable increase in multiplier speed and as little hardware as possible is expected. The multiplier is then suitable for VLSI circuits and can be used efficiently in digital signal processing applications. A speed and hardware comparison with the carry save-add shift multiplier, with and without recomputing with circularly shifted operands, is also made.

#### **ABSTRACT**

This paper studies the speed and hardware performance of the modified fast serial-parallel (MFSP) multiplier with concurrent error detection (CED) using the REcomputing with Circularly shifted Operands (RECO) technique. The MFSP multiplier was proposed primarily to increase the speed of the fast serial-parallel (FSP) multiplier by using a fast parallel adder instead of the parallel ripple carry adder. It can be used in VLSI for high-speed applications when the chip size is limited. Applying the RECO technique for CED to the MFSP multiplier is expected to give a compromise between suitable error coverage, small error latency, suitable multiplication speed improvement and hardware overhead that is as low as possible. The resulting multiplier is suitable for VLSI and can be used efficiently in digital signal processing (DSP) applications. Speed and hardware comparisons with the carry save-add shift (CSAS) multiplier with and without RECO technique are also made.

#### 1. INTRODUCTION

Many important signal processing and communication algorithms use multiplication-intensive operations [1-3], and because speed and cost are the most important design criteria, digital signal processing (DSP) engineers have explored speed- and cost-efficient techniques for computing these operations [4-5]. Taking into consideration circuit complexity, hardware area, pin limitations, execution speed, and data transfer type, the serial-parallel multiplier is the most suitable for digital signal processing. With the advance of VLSI technology, large-scale integration of processing elements that cooperate with each other to accomplish massive computation tasks has resulted in higher speed. However, owing to the increasing complexity and density of VLSI circuits, any failure of a chip may seriously affect all the operations of the system, and hence high computing reliability is required to ensure the validity and integrity of computed results, especially for long computations [6-9]. Such cases appear in critical fields such as radar communication and real-time image processing for robotics control.

A new architecture of serial-parallel multiplier, called the Modified Fast Serial Parallel (MFSP) multiplier, is presented in [10]. It can be used for high-speed applications when the chip size is limited, i.e., when a compromise between speed, area and hardware overhead is required. This multiplier uses 4-bit fast adder stages connected in cascade instead of the second row of parallel ripple carry adders of the fast serial parallel (FSP) multiplier [5]. Concurrent error detection (CED) using the REcomputing with Circularly shifted Operands (RECO) technique [9] applied to the MFSP multiplier has been proposed in [11]. That model attempts a compromise between suitable error coverage, small error latency, suitable multiplication speed improvement over the CSAS multiplier with RECO technique [9], and hardware overhead that is as low as possible. The error detection capability, the error performance and the fault coverage are discussed in [11].

This paper studies the speed and hardware performance of the MFSP multiplier with RECO technique and provides comparisons with the CSAS multiplier with and without RECO technique.

Fig. (1) shows a proposed implementation of an 8-bit MFSP multiplier with RECO technique. It consists of two subcircuits: the upper is the Carry Save - Add Shift (CSAS) multiplier shown in fig. (2), which has been proposed in [9], and the other is the n-bit parallel fast adder shown in fig. (3) [11-13]. This multiplier acts as a CSAS multiplier with RECO technique for the first 2n clock cycles and then reconfigures itself as an n-bit fast parallel adder with RECO technique to add the sum and carry words residing in the multiplier structure. The error latency is about 2 clock cycles for the CSAS subcircuit of the multiplier and a few clock cycles for the parallel fast adder subcircuit, depending on the size of the multiplier [11].
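The two-phase operation described above can be illustrated with a small behavioural sketch. It is not the authors' circuit: it handles unsigned operands only, leaves out the RECO recomputation and the equality checker, and the name `mfsp_multiply` and its parameters are hypothetical. The loop mimics the carry save-add shift phase, which leaves a sum word and a carry word in the structure; the last addition stands in for the n-bit fast parallel adder that combines them.

```python
def mfsp_multiply(a: int, b: int, n: int) -> int:
    """Behavioural model of an n-bit serial-parallel multiplication.

    a is the parallel operand, b is fed serially, LSB first.
    Unsigned operands only; error detection is not modelled.
    """
    mask = (1 << n) - 1
    assert 0 <= a <= mask and 0 <= b <= mask

    sum_word, carry_word = 0, 0   # state held in the multiplier structure
    low_bits = 0                  # product bits shifted out so far

    # Phase 1: carry-save add-shift, one multiplier bit per step.
    for i in range(n):
        pp = a if (b >> i) & 1 else 0                  # row of AND gates
        s = sum_word ^ carry_word ^ pp                 # full-adder sum outputs
        c = (sum_word & carry_word) | (sum_word & pp) | (carry_word & pp)
        low_bits |= (s & 1) << i                       # LSB leaves the array
        # Sum bits shift right by one place; carries already sit one
        # position higher, so they keep their bit index after the shift.
        sum_word, carry_word = s >> 1, c

    # Phase 2: one parallel addition of the residual sum and carry words
    # (the role played by the row of fast adders in the MFSP multiplier).
    high_bits = sum_word + carry_word
    return (high_bits << n) | low_bits
```

In the CSAS multiplier the second phase would instead be carried out by further clock pulses that ripple the carries through the same array, which is the difference the speed comparison in the next section quantifies.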

This paper makes a comparative study in order to test the validity of the MFSP multiplier with RECO technique against the CSAS multiplier with and without RECO technique. This comparison is based on speed and hardware factors for different sizes of multipliers.

#### 2. MULTIPLIERS SPEED COMPARISON

The multiplication speed is one of the most critical factors in the comparison, because increasing the multiplication speed widens the range of applications of such multipliers, especially when reliable multiplication results are one of the important demands on the multiplier. Concurrent error detection techniques slow the normal multiplication speed of a multiplier and add some hardware overhead for error detection, depending on the error latency and error coverage.

The execution speeds of the two multipliers (CSAS with RECO technique and MFSP with RECO technique) are tested by calculating the time consumed by each multiplier. The multiplication time is used as an indication of the speed of the multiplier.

The CSAS and MFSP multipliers with RECO technique produce and add the summands in the same way for the first 2n clock pulses. The difference between the two techniques appears in the addition of the sum and carry words which are present in the structure of the multiplier after the first 2n clock pulses. The total time of the multiplication process can therefore be calculated by the following formula:

$$t_{tot} = t_1 + t_2 \tag{1}$$

Where;

$t_{tot}$ is the total time of the multiplication process.

$t_1$ is the time taken through the first 2n clock pulses.

$t_2$ is the time taken to add the sum and the carry words residing in the multiplier structure after the first 2n clock pulses.

The time taken through the first 2n clock pulses, $t_1$, is constant for the two multipliers and can be calculated from the following formula:

$$t_1 = n\,t \tag{2}$$

Where;

$n$ is the size of the multiplier in bits.

$t$ is the time required to complete a single step of multiplication.

For the primary computation, the clock interval $t_c$ is given by

$$t_c = t_{AND} + t_{shifter} + t_{FA} + t_{setup} \tag{3}$$

For the recomputation step, the clock interval $t_R$ is given by

$$t_R = t_{sw} + t_{shifter} + t_{FA} + t_{comp} \tag{4}$$

Where;

$t_c$ is the clock interval in primary computation.

$t_R$ is the clock interval in recomputation step.

$t_{AND}$ is the delay time of AND gate.

$t_{FA}$ is one full adder delay time.

$t_{sw}$ is the switch gate delay time.

$t_{shifter}$ is the shifter delay time.

$t_{setup}$ is the setup time of the shifter.

$t_{comp}$ is the equality checker delay time.

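Assuming that $t_{AND} = t_{sw}$ and $t_{setup} = t_{comp}$, then from eqns (3) and (4)

$$t_c = t_R$$

and, since each multiplication step consists of one primary computation and one recomputation,

$$t = t_c + t_R = 2\,t_c \tag{5}$$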

Substituting eqn (5) in eqn (2) gives:

$$t_1 = 2n\,t_c \tag{6}$$

The value of $t_2$ is not the same for the two multipliers because it depends on the method of adding the carry and sum words residing in the multiplier structure after the first 2n clock pulses. This time is calculated for the two multiplier techniques, and then the total time is calculated for these techniques using equation (1).

In the CSAS multiplier with RECO technique, another 2n clock pulses are used to add the sum and carry words residing in the multiplier structure and to propagate the carry through the multiplier structure to produce the n MSBs of the product. The time $t_2$ is therefore given by the following equation:

$$t_2 = n\,t$$

$$t_2 = 2n\,t_c \tag{7}$$

Substituting eqns (6) and (7) into eqn (1) gives the total time $T_{CSAS}$ consumed by the CSAS multiplier with RECO technique to complete the multiplication process:

$$t_{tot} = T_{CSAS} = 4n\,t_c \tag{8}$$
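As a quick numerical check of eqns (1) to (8), the sketch below evaluates the two clock intervals and the CSAS total time from a set of component delays. The delay values are illustrative placeholders, not figures from the paper, and the names `clock_intervals` and `csas_reco_total_time` are hypothetical. A multiplication step is taken here as one primary computation plus one recomputation, which is what eqn (5) implies.

```python
def clock_intervals(t_and, t_sw, t_shifter, t_fa, t_setup, t_comp):
    """Return (t_c, t_R) following eqns (3) and (4)."""
    t_c = t_and + t_shifter + t_fa + t_setup   # primary computation
    t_r = t_sw + t_shifter + t_fa + t_comp     # recomputation
    return t_c, t_r


def csas_reco_total_time(n, t_c, t_r):
    """Total time of the n-bit CSAS multiplier with RECO, eqn (1)."""
    step = t_c + t_r          # one step: compute, then recompute
    t1 = n * step             # first 2n clock pulses, eqn (6)
    t2 = n * step             # another 2n clock pulses, eqn (7)
    return t1 + t2


# Illustrative delays in arbitrary units, chosen so that
# t_and == t_sw and t_setup == t_comp (the assumption behind eqn (5)).
t_c, t_r = clock_intervals(t_and=1, t_sw=1, t_shifter=2, t_fa=4,
                           t_setup=1, t_comp=1)
total = csas_reco_total_time(n=8, t_c=t_c, t_r=t_r)

# With t_c == t_R the total collapses to 4 n t_c, as in eqn (8).
assert total == 4 * 8 * t_c
```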

The MFSP multiplier adds the sum and the carry words by using the second row of fast adders instead of using another 2n clock pulses. The time $t_2$ can therefore be calculated as follows:

The computation interval for the fast adder is $t_{PC}$ and is given by:

$$t_{PC} = t_{shifter} + t_{ADD} + t_{sw} + t_{setup} \tag{9}$$

Where;

Mansoura Engineering Journal (MEJ) vol.19.no.4, December 1994 E.116

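$t_{PC}$ is the computation delay of the fast adder.

$t_{ADD}$ is the addition time of the fast adder.

The recomputation interval $t_{PR}$ is:

$$t_{PR} = t_{shifter} + t_{ADD} + t_{sw} + t_{comp} \tag{10}$$

thus, with $t_{setup} = t_{comp}$,

$$t_{PC} = t_{PR} \tag{11}$$

and

$$t_2 = t_{PC} + t_{PR} = 2\,(t_{sw} + t_{shifter} + t_{setup} + t_{ADD}) \tag{12}$$

Where;

$t_{PR}$ is the recomputation delay of the parallel adders.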

The delay of the 1-bit full adder circuit together with the storage and gating elements is given in [5,14] by:

$$t_{cell} = A\,t_{FA} \tag{13}$$

Where $A$ is a coefficient that accounts for the margin required to ensure adequate decay of the transient, so that the proper information is set into the storage cells at each clock pulse.

Observing that $t_c$ is in fact equal to $t_{cell}$ [5,14], and substituting $t_c$ from eqn (13) in eqn (8) gives:

$$t_{tot} = T_{CSAS} = 4nA\,t_{FA} \tag{14}$$

Also, it can be shown from eqn (3) and eqn (13) that:

$$t_{sw} + t_{shifter} + t_{setup} = (A-1)\,t_{FA} \tag{15}$$

Substituting eqn (15) in eqn (12), we get:

$$t_2 = 2\,\{(A-1)\,t_{FA} + t_{ADD}\} \tag{16}$$

The MFSP multiplier uses 4-bit fast adder stages connected in cascade for the second row of parallel adders, and $t_{ADD}$ is calculated by:

$$t_{ADD} = 2\,\delta + 3\,\delta \cdot \frac{n}{4} \tag{17}$$

where $\delta$ is one gate delay. Taking one full adder delay as two gate delays,

$$t_{FA} = 2\,\delta \tag{18}$$

From eqns (17) and (18) it is found that:

$$t_{ADD} = (1 + 0.375\,n)\,t_{FA} \tag{19}$$
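To see how eqn (19) follows from eqn (17), the sketch below expresses the adder delay first in gate delays and then in full-adder delays. It assumes that one full-adder delay equals two gate delays, which is the ratio needed for eqns (17) and (19) to agree; the function names are hypothetical.

```python
from fractions import Fraction

GATE_DELAYS_PER_FA = 2  # assumed: one full-adder delay = two gate delays


def t_add_in_gate_delays(n: int) -> Fraction:
    """Delay of n/4 cascaded 4-bit fast adder stages, eqn (17)."""
    return 2 + 3 * Fraction(n, 4)


def t_add_in_fa_delays(n: int) -> Fraction:
    """The same delay expressed in units of t_FA, eqn (19)."""
    return t_add_in_gate_delays(n) / GATE_DELAYS_PER_FA


for n in (8, 16, 32, 64):
    # eqn (19): t_ADD = (1 + 0.375 n) t_FA, with 0.375 = 3/8
    assert t_add_in_fa_delays(n) == 1 + Fraction(3, 8) * n
```

Exact fractions are used so that the comparison with the closed form does not depend on floating-point rounding.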

Substituting eqn (19) in eqn (16) gives $t_2$ for the MFSP multiplier with RECO technique:

$$t_2 = 2\,(A + 0.375\,n)\,t_{FA} \tag{20}$$

Substituting eqns (6) and (20) into eqn (1), with $t_c = A\,t_{FA}$, gives the total time $T_{MFSP}$ consumed by the MFSP multiplier with RECO technique to complete the multiplication process:

$$t_{tot} = T_{MFSP} = t_1 + t_2 = \{\,n\,(2A + 0.75) + 2A\,\}\,t_{FA} \tag{21}$$
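The substitution leading to eqn (21) can be checked numerically. The sketch below works in units of $t_{FA}$ and compares the sum of the two phase times with the closed form; the function names are hypothetical.

```python
from fractions import Fraction


def t1_first_phase(n, A):
    """First 2n clock pulses with t_c = A * t_FA (eqns (6) and (13))."""
    return 2 * n * A


def t2_fast_adder(n, A):
    """Fast-adder phase with RECO, eqn (20)."""
    return 2 * (A + Fraction(3, 8) * n)


def mfsp_reco_time(n, A):
    """Closed form of eqn (21), in units of t_FA."""
    return n * (2 * A + Fraction(3, 4)) + 2 * A


for A in (2, 4):
    for n in (8, 16, 32, 64):
        assert t1_first_phase(n, A) + t2_fast_adder(n, A) == mfsp_reco_time(n, A)
```

For n = 64 the closed form gives 308 $t_{FA}$ for A = 2 and 568 $t_{FA}$ for A = 4.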

Using eqns (14) and (21), the time consumed by the two multipliers with RECO technique for $A=2$ and $A=4$ can be calculated in terms of $t_{FA}$, as shown in fig. (4). Fig. (5) shows the speed comparison of these multipliers with RECO technique to the ordinary nonredundant CSAS multiplier.

From these figures, it can be seen that:

- 1- The speed of the CSAS multiplier with RECO technique is about 50% of the speed of the nonredundant CSAS multiplier for both $A=2$ and $A=4$.
- 2- The speed of the MFSP multiplier with RECO technique of size n=64 bits is about 83.12% and 90.14% of the speed of the nonredundant CSAS multiplier for $A=2$ and $A=4$, respectively.
- 3- The speed improvement of the MFSP multiplier with RECO technique increases with n, and as n becomes very large its speed approaches about 84.21% and 91.43% of the speed of the nonredundant CSAS multiplier for $A=2$ and $A=4$, respectively.
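The percentages quoted above can be reproduced from eqns (14) and (21). The sketch below assumes that the nonredundant CSAS multiplier takes $2nA\,t_{FA}$, i.e. 2n clock pulses of length $A\,t_{FA}$ with no recomputation; that figure is not stated explicitly in the text but is consistent with the quoted 50%. All times are in units of $t_{FA}$ and the function names are hypothetical.

```python
def csas_time(n, A):
    """Nonredundant CSAS: assumed 2n clock pulses of length A * t_FA."""
    return 2 * n * A


def csas_reco_time(n, A):
    """CSAS with RECO, eqn (14)."""
    return 4 * n * A


def mfsp_reco_time(n, A):
    """MFSP with RECO, eqn (21)."""
    return n * (2 * A + 0.75) + 2 * A


def relative_speed(t_reference, t_other):
    """Speed of 'other' as a percentage of the reference speed."""
    return 100.0 * t_reference / t_other


for A in (2, 4):
    n = 64
    csas = relative_speed(csas_time(n, A), csas_reco_time(n, A))
    mfsp = relative_speed(csas_time(n, A), mfsp_reco_time(n, A))
    limit = 100.0 * 2 * A / (2 * A + 0.75)   # ratio as n grows without bound
    print(f"A={A}: CSAS+RECO {csas:.2f}%, MFSP+RECO {mfsp:.2f}%, limit {limit:.2f}%")
```

The limit is obtained by dividing $2nA$ by $n(2A + 0.75) + 2A$ and letting n grow, which leaves $2A/(2A + 0.75)$.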

#### 3. MULTIPLIERS HARDWARE COMPARISON

The hardware of the multiplier is an important factor in choosing the multiplier, especially when concurrent error detection techniques are used, because they add some hardware overhead for error detection. An increase in hardware leads to an increase in the power consumption and area of the multiplier, thus increasing its cost. Also, the hardware may be a limiting factor for producing the multiplier on a single chip when the chip area is limited.

The gate count [1,15] is used to compare the hardware of the CSAS and MFSP multipliers with RECO technique, because the power consumption [1] and the size [15] of a multiplier depend on the number of logic gates used in the multiplier circuit. It is also a good factor for estimating the cost of multiplier circuits. It is assumed that the fan-in of each logic gate is restricted to a certain constant, but the fan-out is not restricted. Also, for simplicity in evaluation, it is assumed that all logic gates (AND, OR, NAND, NOR) have the same hardware complexity. The area of a circuit is defined as the area of the minimum rectangular region on a plane that includes the layout of the circuit [15].

The two multipliers consist of one or more of the following circuits:

i - AND gate

ii - One bit full adder [10, 13]

iii - One bit latch [16]

iv - 2 to 1 line multiplexer [16]

v - 4-bit fast adder stage [10, 12, 13]

vi - switching gate

vii - TSC-XOR equality checker [17, 18]

Table (1) shows the gate count of each of these circuits.

For different numbers of bits, the hardware comparison for the two multipliers in terms of the gate count is shown in fig. (6). It can be seen that:

- 1- The hardware of the CSAS multiplier with RECO technique is about 200% of the hardware of the nonredundant CSAS multiplier.
- 2- The hardware of the MFSP multiplier with RECO technique is about 270% of the hardware of the nonredundant CSAS multiplier.
- 3- The hardware of the MFSP multiplier with RECO technique is about 135% of the hardware of the CSAS multiplier with RECO technique.
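The third ratio follows from the first two. A minimal check, using only the percentages quoted above (the variable names are hypothetical):

```python
# Hardware relative to the nonredundant CSAS multiplier (= 100%).
csas_reco_pct = 200.0   # CSAS with RECO
mfsp_reco_pct = 270.0   # MFSP with RECO

# MFSP with RECO relative to CSAS with RECO.
mfsp_vs_csas_reco = 100.0 * mfsp_reco_pct / csas_reco_pct

# Overheads, i.e. the increase above the respective baseline.
overhead_vs_csas = mfsp_reco_pct - 100.0
overhead_vs_csas_reco = mfsp_vs_csas_reco - 100.0
```

This gives 135% for the ratio, a 170% overhead against the nonredundant CSAS multiplier and a 35% overhead against the CSAS multiplier with RECO technique.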

Table (2), fig. (7) and fig. (8) show the speed improvement and the hardware overhead of the CSAS and MFSP multipliers without and with RECO technique.

#### 4. CONCLUSION

This paper studies the speed and hardware performance of the modified fast serial parallel (MFSP) multiplier with RECO technique, which was proposed in [11]. A comparison is made between this multiplier and the CSAS multiplier with RECO technique. The results reveal that:

- 1- The speed of the MFSP multiplier with RECO technique is about 15.79% and 8.57% below that of the nonredundant CSAS multiplier for $A=2$ and $A=4$, respectively, with a 170% increase in hardware.
- 2- The speed of the MFSP multiplier with RECO technique exceeds that of the CSAS multiplier with RECO technique by about 34.21% and 41.43% of the nonredundant CSAS multiplier speed for $A=2$ and $A=4$, respectively, with about 35% hardware overhead relative to the CSAS multiplier with RECO technique.
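These margins follow from the large-n speed ratio $2A/(2A + 0.75)$, which is obtained by dividing an assumed nonredundant CSAS time of $2nA\,t_{FA}$ by eqn (21) and letting n grow. A short sketch under that assumption, with hypothetical names:

```python
def asymptotic_speed_pct(A):
    """MFSP-with-RECO speed as a % of nonredundant CSAS speed for large n."""
    return 100.0 * 2 * A / (2 * A + 0.75)


CSAS_RECO_SPEED_PCT = 50.0  # CSAS with RECO runs at half the nonredundant speed

for A in (2, 4):
    speed = asymptotic_speed_pct(A)
    loss = 100.0 - speed                 # shortfall against nonredundant CSAS
    gain = speed - CSAS_RECO_SPEED_PCT   # advantage over CSAS with RECO
    print(f"A={A}: loss {loss:.2f}%, gain {gain:.2f}%")
```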

#### REFERENCES

- [1] S. Waser: 'High-speed monolithic multipliers for real-time digital signal processing', Computer, vol. 11, pp. 19-29, Oct. 1978.
- [2] P. E. Danielsson: 'Serial/parallel convolvers', IEEE Trans. Comput., vol. C-33, pp. 652-667, Jul. 1984.
- [3] N. Kanopoulos: 'A bit-serial architecture for digital signal processing', IEEE Trans., vol. CAS-32, pp. 289-291, Mar. 1985.
- [4] R. F. Lyon: 'Two's complement pipeline multipliers', IEEE Trans. Commun., vol. COM-12, pp. 418-425, Apr. 1976.
- [5] R. Gnanasekaran: 'A fast serial-parallel binary multiplier', IEEE Trans. Comput., vol. C-34, pp. 741-744, Aug. 1985.
- [6] J. H. Patel and L. Y. Fung: 'Concurrent error detection in ALU's by recomputing with shifted operands', IEEE Trans. Comput., vol. C-31, pp. 589-595, Jul. 1982.
- [7] A. O. Abd El-Gawad: 'Fault tolerant architecture for serial-parallel multipliers', Advances in Modelling & Analysis, A, AMSE Press, vol. 20, no. 3, pp. 37-53, 1994.
- [8] T. J. Brosnan and N. R. Strader: 'Modular detection for bit-serial multiplication', IEEE Trans. Comput., vol. 37, pp. 1043-1052, Sep. 1988.
- [9] L. G. Chen and T. H. Chen: 'Fault tolerant serial parallel multiplier', IEE Proc. E, Comput. Digit. Tech., vol. 138, pp. 276-280, Jul. 1991.
- [10] A. I. El-Desoky, A. O. Abd El-Gawad and Y. H. El-Sheshnagy: 'New hardware approach for binary multipliers', Advances in Modelling & Analysis, A, AMSE Press, France, vol. 25, no. 1, pp. 27-36, 1995.
- [11] A. I. El-Desoky, A. O. Abd El-Gawad and Y. H. El-Sheshnagy: 'Concurrent error detection in serial parallel multipliers', Advances in Modelling & Analysis, A, AMSE Press, France, vol. 25, no. 1, pp. 37-46, 1995.
- [12] R. M. M. Oberman: 'Digital circuits for binary arithmetic', MacMillan Press Ltd, 1979.
- [13] Staff of Texas Instruments: 'The TTL data book for design engineers', Texas Instruments Inc., 1983.
- [14] A. El-Desoky: 'A new technique for binary multipliers', 15th International Conference for Statistic, Computer, Science, Social, and Demographic Research, pp. 89-101, 1990.
- [15] N. Takagi, H. Yasuura and S. Yajima: 'High speed VLSI multiplication algorithm with a redundant binary tree', IEEE Trans. Comput., vol. C-34, pp. 789-796, Sept. 1985.
- [16] M. M. Mano: 'Digital design', Prentice Hall, 1984.
- [17] J. E. Smith and P. Lam: 'A theory of totally self-checking system design', IEEE Trans. Comput., vol. C-32, pp. 831-844, Sep. 1983.
- [18] N. Gaitanis: 'A totally self checking error indicator', IEEE Trans. Comput., vol. C-34, pp. 758-761, Aug. 1985.



Fig. (1) Modified Fast Serial Parallel (MFSP) multiplier with RECO technique



Fig. (2) Carry Save - Add Shift serial parallel multiplier with RECO technique



Fig. (3) AND-OR realization of 4-bit fast adder



Table (1) Gate count of multiplier's components

| Circuit                  | No. of gates |
|--------------------------|--------------|
| AND gate                 | 1            |
| One bit full adder       | 9            |
| One bit latch            | 6            |
| 2 to 1 line multiplexer  | 3            |
| 4-bit fast adder         | 38           |
| Switching gate           | 1            |
| TSC-XOR equality checker | 6            |



Fig. (6) Hardware comparison for the two multipliers