

INTRODUCTION

1.1 Overview

The computation unit is central to most digital systems. It performs arithmetic operations such as addition, subtraction and multiplication, as well as logical operations such as AND, OR, INVERT and XOR. Multipliers are used extensively in microprocessor, DSP and communication applications. For higher-order multiplication, a large number of adders is needed to sum the partial products. The demand for high-speed processors keeps increasing, and high-throughput arithmetic is important for achieving the desired performance in many real-time signal and image processing applications. The performance of a processor depends largely on the time the ALU takes to perform a specified operation, and multiplication is one of the fundamental arithmetic functions. Multiplication-based operations such as Multiply and Accumulate (MAC) and inner product are among the frequently used Computation Intensive Arithmetic Functions (CIAF) implemented in many Digital Signal Processing (DSP) applications such as convolution, Fast Fourier Transform (FFT) and filtering, and in the arithmetic and logic unit of microprocessors. [16]

Because multiplication dominates the execution time of most DSP algorithms, a high-speed multiplier is needed. The core computing process is almost always a multiplication routine, so engineers are constantly looking for new algorithms and hardware to implement it. The demand for high-speed processing has been increasing as a result of expanding computer and signal processing applications. Higher-throughput arithmetic is important for achieving the desired performance in many real-time signal and image processing applications. [15] One of the key arithmetic operations in such applications is multiplication, and the development of fast multiplier circuits has been a subject of interest for decades. [20] Reducing delay and power consumption is an essential requirement for many applications. The major units of any processor are the control unit, the ALU and the memory read/write unit. The ALU is an execution unit that performs both arithmetic and logical operations, which is why it is called the heart of microprocessors, microcontrollers and CPUs. Almost every digital technology relies, fully or partially, on operations performed by the ALU. The block diagram of the ALU is given in Figure 1.1, where the ALU has been implemented on an FPGA.

The input interface to the ALU module is the set of input switches on the FPGA board; after the data has been processed, the result is shown on the LCD output of the FPGA board.

The speed of the multiplication operation is of great importance in DSP as well as in general-purpose processors. In the past, multiplication was generally implemented as a sequence of addition, subtraction and shift operations. Many algorithms have been proposed in the literature to perform multiplication, each offering different advantages and trade-offs in terms of speed, circuit complexity, area and power consumption. The multiplier is a fairly large block of a computing system. The amount of circuitry involved is proportional to the square of its resolution, i.e. a multiplier of size n bits has on the order of n² gates. For multiplication algorithms used in DSP applications, latency and throughput are the two major concerns from a delay perspective. Latency is the real delay of computing a function: a measure of how long after the inputs to a device become stable the final result is available on the outputs. Throughput is a measure of how many multiplications can be performed in a given period of time. The multiplier is not only a high-delay block but also a major source of power dissipation. Therefore, if one also aims to minimize power consumption, it is of great interest to reduce the delay by using various delay optimizations.
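The classical shift-and-add approach mentioned above can be sketched in a few lines. This is only a minimal software illustration for unsigned operands; the function name `shift_add_multiply` and the `width` parameter are assumptions made for this example and do not come from the thesis.

```python
def shift_add_multiply(a: int, b: int, width: int = 8) -> int:
    """Multiply two unsigned integers using only shifts and additions.

    Hypothetical helper for illustration; `width` is the operand size in bits.
    """
    limit = 1 << width
    if not (0 <= a < limit and 0 <= b < limit):
        raise ValueError("operands must fit in `width` unsigned bits")
    product = 0
    for i in range(width):
        # Bit i of the multiplier decides whether the multiplicand,
        # shifted left by i positions, is added to the running sum.
        if (b >> i) & 1:
            product += a << i
    return product
```

Each of the `width` iterations corresponds to one row of `width` partial-product bits, which is where the roughly n² growth in circuitry comes from when the loop is unrolled into hardware.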

1.2 Problem Statement

The two most common multiplication algorithms used in digital hardware are the array multiplication algorithm and the Booth multiplication algorithm. The computation time of the array multiplier is comparatively low because the partial products are calculated independently in parallel; the delay associated with the array multiplier is the time taken by the signals to propagate through the gates that form the multiplication array.

Booth multiplication is another important multiplication algorithm. Large Booth arrays are required for high-speed multiplication and exponential operations, which in turn require large partial-sum and partial-carry registers. Multiplication of two n-bit operands using a radix-4 Booth recoding multiplier requires approximately n / (2m) clock cycles to generate the least significant half of the final product, where m is the number of Booth recoder adder stages. Thus, a large propagation delay is associated with this case. Because of the importance of digital multipliers in DSP, this has always been an active area of research, and a number of interesting multiplication algorithms have been reported in the literature.
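To make the radix-4 recoding concrete, the following sketch recodes a two's-complement multiplier into digits from the set {-2, -1, 0, 1, 2} and sums the corresponding multiples of the multiplicand. It is a behavioural illustration only, assuming an even operand width; the function names are hypothetical and the code does not model the adder stages or clock cycles discussed above.

```python
from typing import List


def booth_radix4_digits(multiplier: int, width: int) -> List[int]:
    """Recode a signed `width`-bit multiplier into radix-4 Booth digits.

    Hypothetical helper; digits are returned least significant first.
    """
    if width <= 0 or width % 2:
        raise ValueError("width must be a positive even number")
    if not -(1 << (width - 1)) <= multiplier < (1 << (width - 1)):
        raise ValueError("multiplier does not fit in `width` signed bits")
    bits = multiplier & ((1 << width) - 1)  # two's-complement bit pattern
    digits = []
    prev = 0  # implicit bit to the right of the least significant bit
    for i in range(0, width, 2):
        b0 = (bits >> i) & 1
        b1 = (bits >> (i + 1)) & 1
        # Each overlapping bit triplet (b1, b0, prev) maps to one digit.
        digits.append(b0 + prev - 2 * b1)
        prev = b1
    return digits


def booth_radix4_multiply(a: int, b: int, width: int = 8) -> int:
    """Multiply `a` by signed `b` using the recoded digits of `b`."""
    return sum(d * a * 4 ** k for k, d in enumerate(booth_radix4_digits(b, width)))
```

The recoding produces width / 2 digits, so only half as many partial products need to be added compared with a bit-by-bit scheme.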

1.3 Thesis Objectives

The main objective of this work is to implement a high-speed Arithmetic Unit that uses a Vedic Mathematics algorithm for multiplication. The Arithmetic Unit performs addition, subtraction and multiply-accumulate operations. The MAC unit in the arithmetic module uses a fast multiplier designed with the Vedic Mathematics algorithm to reduce the multiplication time and improve the overall performance of the arithmetic operations.
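As a rough behavioural picture of the multiply-accumulate operation, the sketch below accumulates products one pair at a time, which is how an inner product is built from repeated MAC steps. The names `multiply_accumulate` and `inner_product` and the pluggable `multiply` argument are assumptions for illustration; they do not describe the thesis's hardware design.

```python
from typing import Callable, Sequence


def multiply_accumulate(acc: int, a: int, b: int,
                        multiply: Callable[[int, int], int] = lambda x, y: x * y) -> int:
    """One MAC step: return acc + a * b.

    `multiply` is a hypothetical hook so that a different multiplier
    routine can be substituted for the built-in operator.
    """
    return acc + multiply(a, b)


def inner_product(xs: Sequence[int], ys: Sequence[int]) -> int:
    """Inner product computed as a chain of MAC steps."""
    if len(xs) != len(ys):
        raise ValueError("vectors must have the same length")
    acc = 0
    for x, y in zip(xs, ys):
        acc = multiply_accumulate(acc, x, y)
    return acc
```

Because every step contains one multiplication, the speed of the multiplier directly bounds the speed of the whole accumulation.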

1.4 Thesis Organization

Chapter 2 discusses the basic concepts of Vedic mathematics, multiplication, and the architecture and functionality of the Vedic multiplier.

Chapter 3 describes earlier designs of the different modules of the Arithmetic Unit.

Chapter 4 describes the design of the different modules of the Arithmetic Unit. These modules are coded using the Xilinx ISE (Integrated Software Environment) simulation software.

Chapter 5 presents and verifies the simulation results obtained from the realization of the high-performance Arithmetic Unit using Vedic mathematics.

Chapter 6 draws conclusions from these results and discusses the future scope of the thesis work.

1.5 Tools Used

Software Used: Xilinx ISE (Integrated Software Environment) 13.1 has been used for the synthesis and simulation of the design.

Hardware Used: Virtex 6 (Family), XC3S500 / XC3S1600 (Device), FG320 / FG484 (Package), -5 (Speed Grade) FPGA devices.

REVIEW OF LITERATURE SURVEY

2.1 History of Vedic Mathematics

Vedic mathematics is part of the four Vedas (books of wisdom). It belongs to the Sthapatya-Veda (a book on civil engineering and architecture), which is an upa-veda (supplement) of the Atharva Veda. It covers several modern mathematical topics including arithmetic, geometry (plane, co-ordinate), trigonometry, quadratic equations, factorization and even calculus.

His Holiness Jagadguru Shankaracharya Bharati Krishna Teerthaji Maharaja (1884-1960) compiled all this work and gave its mathematical explanation while discussing it for various applications. Swamiji constructed 16 sutras (formulae) and 16 upa-sutras (sub-formulae) after extensive research in the Atharva Veda. These formulae are not to be found in the present text of the Atharva Veda because they were constructed by Swamiji himself. Vedic mathematics (VM) is not only a mathematical wonder but is also logical, which is why it has a degree of eminence that cannot be denied. Owing to these characteristics, VM has already crossed the boundaries of India and has become a leading topic of research abroad. VM deals with several basic as well as complex mathematical operations. In particular, its methods of basic arithmetic are extremely simple and powerful [10, 18].

The beauty of Vedic mathematics lies in the fact that it reduces otherwise cumbersome-looking calculations in conventional mathematics to very simple ones. This is because the Vedic formulae are claimed to be based on the natural principles on which the human mind works. It is an interesting field that presents some effective algorithms which can be applied to various branches of engineering such as computing and digital signal processing [12, 18].

Multiplier architectures can generally be classified into three categories. The first is the serial multiplier, which emphasizes minimal hardware and chip area. The second is the parallel multiplier (array and tree), which carries out high-speed mathematical operations at the cost of relatively larger chip area. The third is the serial-parallel multiplier, which serves as a good trade-off between the time-consuming serial multiplier and the area-consuming parallel multiplier.

2.2 Algorithms of Vedic Mathematics

The word Vedic is derived from the word “Veda”, which means the storehouse of all knowledge. Vedic mathematics is mainly based on 16 Sutras (or aphorisms) dealing with various branches of mathematics such as arithmetic, algebra and geometry. These Sutras, along with their brief meanings, are listed below alphabetically.

1) (Anurupye) Shunyamanyat – If one is in ratio, the other is zero.

2) Chalana-Kalanabyham – Differences and Similarities.

3) Ekadhikina Purvena – By one more than the previous one.

4) Ekanyunena Purvena – By one less than the previous one.

5) Gunakasamuchyah – The factors of the sum are equal to the sum of the factors.

6) Gunitasamuchyah – The product of the sum is equal to the sum of the products.

7) Nikhilam Navatashcaramam Dashatah – All from 9 and last from 10.

8) Paraavartya Yojayet – Transpose and adjust.

9) Puranapuranabyham – By the completion or non-completion.

10) Sankalana-vyavakalanabhyam – By addition and by subtraction.

11) Shesanyankena Charamena – The remainders by the last digit.

12) Shunyam Saamyasamuccaye – When the sum is the same that sum is zero.

13) Sopaantyadvayamantyam – The ultimate and twice the penultimate.

14) Urdhva-tiryakbhyam – Vertically and crosswise.

15) Vyashtisamanstih – Part and Whole.

16) Yaavadunam – Whatever the extent of its deficiency.

These methods and ideas can be directly applied to trigonometry, plane and spherical geometry, conics, calculus (both differential and integral), and applied mathematics of various kinds. As mentioned earlier, all these Sutras were reconstructed from ancient Vedic texts early in the last century. Many Sub-sutras were also discovered at the same time; they are not discussed here.

2.2.1 Vedic Multiplication

The proposed Vedic multiplier is based on the Vedic multiplication formulae (Sutras). These Sutras have traditionally been used for the multiplication of two numbers in the decimal number system. In this work, we apply the same ideas to the binary number system to make the proposed algorithm compatible with digital hardware. Vedic multiplication can be carried out with several algorithms, some of which are discussed below:

2.2.1.1 Urdhva Tiryakbhyam Sutra

The multiplier is based on the Urdhva Tiryakbhyam (Vertically and Crosswise) algorithm of ancient Indian Vedic Mathematics. The Urdhva Tiryakbhyam Sutra is a general multiplication formula applicable to all cases of multiplication. It literally means “Vertically and crosswise”. It is based on a novel concept in which all partial products are generated concurrently with their addition. This parallelism in the generation of partial products and their summation is illustrated in Figure 2.1. The algorithm can be generalized to n x n bit numbers. Since the partial products and their sums are calculated in parallel, the multiplier is independent of the clock frequency of the processor.

Thus the multiplier requires the same amount of time to calculate the product regardless of the clock frequency. The net advantage is that it reduces the need for microprocessors to operate at increasingly high clock frequencies. While a higher clock frequency generally results in increased processing power, it also increases power dissipation, which results in higher device operating temperatures. By adopting the Vedic multiplier, microprocessor designers can circumvent these problems and avoid catastrophic device failures. The processing power of the multiplier can easily be increased by increasing the input and output data bus widths, since it has quite a regular structure. Owing to this regular structure, it can also be laid out easily on a silicon chip.

The multiplier has the advantage that, as the number of bits increases, gate delay and area increase very slowly compared to other multipliers. It is therefore time, space and power efficient. This architecture has been demonstrated to be quite efficient in terms of silicon area and speed.
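The vertical-and-crosswise idea applied to binary operands can be sketched as follows. This is a software model written under the assumption of unsigned n-bit inputs: for each output column it gathers all AND terms whose bit positions add up to that column, adds the carry from the previous column, and keeps one result bit. The function name `urdhva_multiply_bits` is hypothetical, and the loop is sequential here even though the column terms are independent and would be formed in parallel in hardware.

```python
def urdhva_multiply_bits(a: int, b: int, n: int) -> int:
    """Multiply two unsigned n-bit integers column by column.

    Hypothetical illustration of the vertical-and-crosswise scheme in binary.
    """
    if not (0 <= a < (1 << n) and 0 <= b < (1 << n)):
        raise ValueError("operands must fit in n unsigned bits")
    a_bits = [(a >> i) & 1 for i in range(n)]
    b_bits = [(b >> i) & 1 for i in range(n)]
    result = 0
    carry = 0
    for col in range(2 * n - 1):
        column_sum = carry
        # Crosswise terms: every pair (i, j) with i + j == col.
        for i in range(max(0, col - n + 1), min(col, n - 1) + 1):
            column_sum += a_bits[i] & b_bits[col - i]
        result |= (column_sum & 1) << col  # low bit is the result bit
        carry = column_sum >> 1            # the rest moves to the next column
    # The final carry becomes the most significant bit of the 2n-bit product.
    result |= carry << (2 * n - 1)
    return result
```

The first and last columns contain a single "vertical" term, while the middle columns contain the "crosswise" terms, which mirrors the name of the Sutra.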

To illustrate this multiplication scheme, consider the multiplication of two decimal numbers (5498 × 2314). The conventional method already known to us requires 16 multiplications and 15 additions.

An alternative method of multiplication using the Urdhva Tiryakbhyam Sutra is shown in Figure 2.1. The numbers to be multiplied are written on two consecutive sides of the square as shown in the figure. The square is divided into rows and columns, where each row or column corresponds to one of the digits of either the multiplier or the multiplicand. Thus, each digit of the multiplier has a small box in common with each digit of the multiplicand. These small boxes are partitioned into two halves by the crosswise lines.

Each digit of the multiplier is then independently multiplied with every digit of the multiplicand and the two-digit product is written in the common box. All the digits lying on a crosswise dotted line are added to the previous carry. The least significant digit of the obtained number acts as the result digit and the rest as the carry for the next step. Carry for the first step (i.e., the dotted line on the extreme right side) is taken to be zero.
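The box-and-diagonal procedure described above can be mirrored in a short sketch. It assumes non-negative integers, splits each single-digit product into its tens and units halves, sums the digits along each crosswise line, and propagates the carry from right to left starting with zero. The function name `crosswise_multiply` is an assumption for this example.

```python
def crosswise_multiply(x: int, y: int) -> int:
    """Multiply two non-negative integers by summing digits along diagonals.

    Hypothetical illustration of the square-and-crosswise-line method.
    """
    if x < 0 or y < 0:
        raise ValueError("only non-negative integers are supported")
    xd = [int(c) for c in reversed(str(x))]  # least significant digit first
    yd = [int(c) for c in reversed(str(y))]
    # One slot per crosswise line; line k collects the units halves of boxes
    # with i + j == k and the tens halves of boxes with i + j == k - 1.
    diagonals = [0] * (len(xd) + len(yd))
    for i, dx in enumerate(xd):
        for j, dy in enumerate(yd):
            tens, units = divmod(dx * dy, 10)
            diagonals[i + j] += units
            diagonals[i + j + 1] += tens
    digits = []
    carry = 0  # carry into the rightmost line is zero
    for line_sum in diagonals:
        carry, digit = divmod(line_sum + carry, 10)
        digits.append(digit)
    # The product has at most len(xd) + len(yd) digits, so no carry remains.
    return int("".join(str(d) for d in reversed(digits)))
```

For the example in the text, `crosswise_multiply(5498, 2314)` returns 12722372, which matches the product obtained by ordinary multiplication.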