Compare and contrast papers: Vlsi Implementation of Array Based Fir Filter Folding Essays

We are grateful to our Principal, Prof. K. Venkataramani, for his support and direction in the course of the project. We take great pleasure in thanking our Head of the Department, Dr. S. Jayashri, who has always been a source of inspiration. Her constant motivation has been a driving force for the successful completion of the project. This project was made possible by the proficient and prompt guidance given by our Project Guide, Mr. J. Selvakumar. We take this opportunity to express our gratitude for the encouragement he has provided us. We are indebted to him for spending his valuable time with us. We thank our project coordinator, Mrs. M. Susila, for conducting periodic reviews and giving us valuable suggestions. We also thank the lab technicians for their help and cooperation.

ABSTRACT

This project aims to implement a finite impulse response (FIR) filter based on multiplier arrays in Very Large Scale Integration (VLSI) and intends to show the reduction in hardware complexity that results from folding techniques. The FIR filter, being one of the fundamental components of digital signal processing (DSP), has a vital role to play in communication and signal processing. The advantages of the FIR filter are stability and easy implementation, but these are undermined by its hardware complexity due to the large number of filter taps. Because FIR filters involve repetitive multiplications, processes such as folding are used to reduce their hardware complexity. This project deals with the implementation of an 8-tap FIR filter in unfolded, folded and two-stage cascaded folded forms. Cascading combines the merits of folded and unfolded schemes.
The filters are implemented with four multipliers: Braun array, ripple carry, carry save and Wallace tree. The performance of the structures with the four multipliers is compared in terms of hardware complexity and combinational path delay. The advantages of VLSI such as low cost, low power, high reliability, small size and high functionality are to be exploited in this project. The hardware description language used is Verilog HDL.

TABLE OF CONTENTS

ABSTRACT
LIST OF TABLES
LIST OF FIGURES
LIST OF SYMBOLS AND ABBREVIATIONS

1. INTRODUCTION
1.1 DIGITAL SIGNAL PROCESSING
1.1.1 Analog and digital signals
1.1.2 Signal processing
1.1.3 Digital signal processors
1.1.4 Applications of DSP

2. LITERATURE OVERVIEW
2.1 FILTERS
2.2 ANALOG FILTERS
2.3 DIGITAL FILTERS
2.3.1 Advantages of digital filter
2.3.2 Operation of digital filter
2.3.3 FIR and IIR filters
2.3.4 FIR filter
2.3.4.1 Terms used in FIR filter
2.3.4.2 Advantages of FIR filter
2.3.4.3 Disadvantages of FIR filter
2.4 FOLDING

3. ARCHITECTURES OF FIR FILTER
3.1 UNFOLDED ARCHITECTURE
3.2 FOLDED ARCHITECTURE OF K TAP FIR FILTER
3.3 CASCADED ARCHITECTURE OF FIR FILTER

4. MULTIPLIERS
4.1 BASICS OF DIGITAL MULTIPLICATION
4.2 ARRAY MULTIPLIER
4.2.1 Braun array multiplier
4.2.2 Ripple carry multiplier
4.2.3 Carry save multiplier
4.3 TREE MULTIPLIER
4.3.1 Wallace tree multiplier

5. SOURCE CODE
5.1 UNFOLDED FILTER
5.1.1 Top module
5.1.2 16 bit adder
5.1.3 17 bit adder
5.1.4 18 bit adder
5.1.5 D flip flop
5.1.6 Multiplier
5.2 FOLDED FILTER
5.2.1 Top module
5.2.2 Adder module
5.2.3 D-R module
5.2.4 D flip flop
5.2.5 Multiplier adder unit
5.2.6 Multiplier
5.2.7 Multiplexer
5.2.8 C-R module
5.3 CASCADED FIR FILTER
5.3.1 Top module
5.3.2 Adder module
5.3.3 D-R module
5.3.4 D flip flop
5.3.5 Filter stage 1
5.3.6 Filter stage 2
5.3.7 Multiplier adder unit 1
5.3.8 Multiplier adder unit 2
5.3.9 Multiplier
5.3.10 Multiplexer
5.3.11 C-R module
5.4 MULTIPLIERS
5.4.1 Braun array multiplier
5.4.2 Carry save multiplier
5.4.3 Ripple carry multiplier
5.4.4 Wallace tree multiplier
5.5 ADDERS
5.5.1 Full adder
5.5.2 Half adder

6. SIMULATION RESULTS
6.1 UNFOLDED
6.2 FOLDED
6.3 CASCADED

7. XILINX SYNTHESIS AND POWER REPORT
7.1 UNFOLDED FIR FILTER
7.1.1 Synthesis report
7.1.2 Power report
7.2 FOLDED FIR FILTER WITH BRAUN ARRAY MULTIPLIER
7.2.1 Synthesis report
7.2.2 Power report
7.3 FOLDED FIR FILTER WITH CARRY SAVE MULTIPLIER
7.3.1 Synthesis report
7.3.2 Power report
7.4 FOLDED FIR FILTER WITH RIPPLE CARRY MULTIPLIER
7.4.1 Synthesis report
7.4.2 Power report
7.5 FOLDED FIR FILTER WITH WALLACE TREE MULTIPLIER
7.5.1 Synthesis report
7.5.2 Power report
7.6 CASCADED FIR FILTER WITH BRAUN ARRAY MULTIPLIER
7.6.1 Synthesis report
7.6.2 Power report
7.7 CASCADED FIR FILTER WITH CARRY SAVE MULTIPLIER
7.7.1 Synthesis report
7.7.2 Power report
7.8 CASCADED FIR FILTER WITH RIPPLE CARRY MULTIPLIER
7.8.1 Synthesis report
7.8.2 Power report
7.9 FOLDED FIR FILTER WITH WALLACE TREE MULTIPLIER
7.9.1 Synthesis report
7.9.2 Power report

8. RTL SCHEMATICS
8.1 UNFOLDED FIR FILTER
8.2 FOLDED FIR FILTER
8.3 CASCADED FIR FILTER

9. FPGA EDITOR DIAGRAMS
9.1 UNFOLDED FIR FILTER
9.2 FOLDED FIR FILTER
9.3 CASCADED FIR FILTER

10. COMPARISON
10.1 COMPARISON OF MULTIPLIERS
10.1.1 Charts
10.2 COMPARISON OF THE THREE ARCHITECTURES OF FIR FILTER
10.2.1 Charts
10.2.2 Tables

11. CONCLUSION AND FUTURE ENHANCEMENTS

APPENDICES
A XILINX SPARTAN II FPGA FAMILY
B VERILOG

REFERENCES

LIST OF TABLES:

1. Table10.1. Comparison of the gate count of architectures
2. Table10.2. Comparison of the different structures
3. Table10.3. Comparison of the gate count of the structures with different multipliers

LIST OF FIGURES:

1. Fig2.1. Filtering operation
2. Fig2.2. Signal processing system
3. Fig2.3. FIR filter
4. Fig3.1. FIR filter in direct form
5. Fig3.2. Folded architecture of k tap FIR filter
6. Fig3.3. Timing diagram
7. Fig3.4. 8-tap direct form FIR filter divided into two filter stages
8. Fig3.5. Cascaded structure of two folded filter stages
9. Fig3.6. Timing diagram
10. Fig4.1. Multiplier cell
11. Fig4.2. Braun array multiplier
12. Fig4.3. Ripple carry array multiplier
13. Fig4.5. Carry save array multiplier
14. Fig4.6. Flowchart of a tree multiplier
15. Fig4.7. Transforming a partial product tree into a Wallace tree
16. Fig4.8. Wallace tree multiplier
17. Fig6.1. ModelSim output of unfolded FIR filter
18. Fig6.2. ModelSim output of folded FIR filter
19. Fig6.3. ModelSim output of cascaded FIR filter
20. Fig7.1. Project properties
21. Fig8.1. RTL schematic of unfolded FIR filter
22. Fig8.2. RTL schematic of folded FIR filter
23. Fig8.3. RTL schematic of cascaded FIR filter
24. Fig9.1. FPGA editor diagram of unfolded FIR filter
25. Fig9.2. FPGA editor diagram of folded FIR filter
26. Fig9.3. FPGA editor diagram of cascaded FIR filter
27. Fig10.1. Combinational delay of the multipliers
28. Fig10.2. Hardware complexity of the multipliers
29. Fig10.3. Hardware complexity of the three architectures of FIR filter
30. Fig10.4. No of clock cycles to obtain output
31. Fig10.5. Combinational delay
32. Fig10.6. No of slices

LIST OF SYMBOLS AND ABBREVIATIONS:

ADC : Analog to Digital Converter
ALU : Arithmetic and Logic Unit
ASIC : Application Specific Integrated Circuit
BAM : Braun Array Multiplier
CSM : Carry Save Multiplier
DAC : Digital to Analog Converter
FA : Full Adder
FFT : Fast Fourier Transform
FIR : Finite Impulse Response
FPGA : Field Programmable Gate Array
GCLK : Global Clock
HA : Half Adder
HDL : Hardware Description Language
IEEE : Institute of Electrical and Electronics Engineers
IIR : Infinite Impulse Response
IOB : Input Output Buffer
LSB : Least Significant Bit
LUT : Look Up Table
MAC : Multiply Accumulate
MSB : Most Significant Bit
RCM : Ripple Carry Multiplier
RTL : Register Transfer Level
VHDL : VHSIC Hardware Description Language
VLSI : Very Large Scale Integration

1. INTRODUCTION:

1.1 DIGITAL SIGNAL PROCESSING:

Digital Signal Processing, as the term suggests, is the processing of signals by digital means. A signal in this context can mean a number of different things. Historically the origins of signal processing are in electrical engineering, and a signal here means an electrical signal carried by a wire or telephone line, or perhaps by a radio wave. More generally, however, a signal is a stream of information representing anything from stock prices to data from a remote-sensing satellite. The term digital comes from digit, meaning a number (you count with your fingers, your digits), so digital literally means numerical; the French word for digital is 'numérique'. A digital signal consists of a stream of numbers, usually (but not necessarily) in binary form. The processing of a digital signal is done by performing numerical calculations.

1.1.1 ANALOG AND DIGITAL SIGNALS:

In many cases, the signal of interest is initially in the form of an analog electrical voltage or current, produced for example by a microphone or some other type of transducer.
In some situations, such as the output from the readout system of a CD (compact disc) player, the data is already in digital form. An analog signal must be converted into digital form before DSP techniques can be applied. An analog electrical voltage signal, for example, can be digitized using an electronic circuit called an analog-to-digital converter or ADC. This generates a digital output as a stream of binary numbers whose values represent the electrical voltage input to the device at each sampling instant.

1.1.2 SIGNAL PROCESSING:

Signals commonly need to be processed in a variety of ways. For example, the output signal from a transducer may well be contaminated with unwanted electrical noise. The electrodes attached to a patient's chest when an ECG is taken measure tiny electrical voltage changes due to the activity of the heart and other muscles. The signal is often strongly affected by mains pickup due to electrical interference from the mains supply. Processing the signal using a filter circuit can remove or at least reduce the unwanted part of the signal. Increasingly nowadays, the filtering of signals to improve signal quality or to extract important information is done by DSP techniques rather than by analog electronics.

The development of digital signal processing dates from the 1960s with the use of mainframe digital computers for number-crunching applications such as the Fast Fourier Transform (FFT), which allows the frequency spectrum of a signal to be computed rapidly. These techniques were not widely used at that time, because suitable computing equipment was generally available only in universities and other scientific research institutions.

1.1.3 DIGITAL SIGNAL PROCESSORS:

The introduction of the microprocessor in the late 1970s and early 1980s made it possible for DSP techniques to be used in a much wider range of applications. However, general-purpose microprocessors such as the Intel x86 family are not ideally suited to the numerically intensive requirements of DSP, and during the 1980s the increasing importance of DSP led several major electronics manufacturers (such as Texas Instruments, Analog Devices and Motorola) to develop Digital Signal Processor chips: specialized microprocessors with architectures designed specifically for the types of operations required in digital signal processing. Like a general-purpose microprocessor, a Digital Signal Processor is a programmable device, with its own native instruction code. DSP chips are capable of carrying out millions of floating point operations per second, and like their better-known general-purpose cousins, faster and more powerful versions are continually being introduced. DSPs can also be embedded within complex system-on-chip devices, often containing both analog and digital circuitry.

1.1.4 APPLICATIONS OF DSP:

DSP technology is nowadays commonplace in such devices as mobile phones, multimedia computers, video recorders, CD players, hard disc drive controllers and modems, and will soon replace analog circuitry in TV sets and telephones. An important application of DSP is in signal compression and decompression. Signal compression is used in digital cellular phones to allow a greater number of calls to be handled simultaneously within each local cell. DSP signal compression technology allows people not only to talk to one another but also to see one another on their computer screens, using small video cameras mounted on the computer monitors, with only a conventional telephone line linking them together.
In audio CD systems, DSP technology is used to perform complex error detection and correction on the raw data as it is read from the CD.

Although some of the mathematical theory underlying DSP techniques, such as Fourier and Hilbert transforms, digital filter design and signal compression, can be fairly complex, the numerical operations actually required to implement these techniques are very simple, consisting mainly of operations that could be done on a cheap four-function calculator. The architecture of a DSP chip is designed to carry out such operations incredibly fast, processing hundreds of millions of samples every second, to provide real-time performance: that is, the ability to process a signal live as it is sampled and then output the processed signal, for example to a loudspeaker or video display. All of the practical examples of DSP applications mentioned earlier, such as hard disc drives and mobile phones, demand real-time operation.

The major electronics manufacturers have invested heavily in DSP technology. Because they now find application in mass-market products, DSP chips account for a substantial proportion of the world market for electronic devices. Sales amount to billions of dollars annually, and seem likely to continue to increase rapidly.

2. LITERATURE OVERVIEW:

2.1 FILTERS:

In signal processing, the function of a filter is to remove unwanted parts of the signal, such as random noise, or to extract useful parts of the signal, such as the components lying within a certain frequency range. The following block diagram illustrates the basic idea.

Fig2.1. Filtering operation

There are two main kinds of filter, analog and digital. They are quite different in their physical makeup and in how they work.

2.2 ANALOG FILTERS:

An analog filter uses analog electronic circuits made up from components such as resistors, capacitors and op amps to produce the required filtering effect.
Such filter circuits are widely used in applications such as noise reduction, video signal enhancement, graphic equalizers in hi-fi systems, and many other areas. There are well-established standard techniques for designing an analog filter circuit for a given requirement. At all stages, the signal being filtered is an electrical voltage or current, which is the direct analogue of the physical quantity (e.g. a sound or video signal or transducer output) involved.

2.3 DIGITAL FILTERS:

A digital filter is any electronic filter that works by performing digital math operations on an intermediate form of a signal. This is in contrast with older analog filters, which work entirely in the analog realm and must rely on physical networks of electronic components (such as resistors, capacitors, transistors, etc.) to achieve a desired filtering effect. Digital filters can achieve virtually any filtering effect that can be expressed as a mathematical algorithm. The two primary limitations of digital filters are their speed (the filter can't operate any faster than the computer at the heart of the filter) and their cost. However, as the cost of integrated circuits has continued to drop over time, digital filters have become increasingly commonplace and are now an essential element of many everyday objects such as radios, cellphones, and stereo receivers.

The analog input signal must first be sampled and digitized using an ADC (analog to digital converter). The resulting binary numbers, representing successive sampled values of the input signal, are transferred to the processor, which carries out numerical calculations on them. These calculations typically involve multiplying the input values by constants and adding the products together. If necessary, the results of these calculations, which now represent sampled values of the filtered signal, are output through a DAC (digital to analog converter) to convert the signal back to analog form.
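The multiply-and-add calculation described above can be illustrated with a small sketch. It is written in Python rather than the project's Verilog, and the function name `weighted_sum_filter`, the two constants and the sample values are all made up for illustration; it is not taken from the project sources.

```python
def weighted_sum_filter(samples, constants):
    """Each output is a sum of products: constants[i] * samples[n - i].

    Only samples that have already arrived (index >= 0) are used.
    """
    out = []
    for n in range(len(samples)):
        acc = 0.0
        for i, c in enumerate(constants):
            if n - i >= 0:
                acc += c * samples[n - i]
        out.append(acc)
    return out


# Hypothetical digitized input that alternates between 0 and 3.
noisy = [0.0, 3.0, 0.0, 3.0, 0.0, 3.0]

# Averaging each sample with the previous one smooths the alternation.
smoothed = weighted_sum_filter(noisy, [0.5, 0.5])
print(smoothed)  # [0.0, 1.5, 1.5, 1.5, 1.5, 1.5]
```

Choosing different constants changes the filtering action without changing the structure of the calculation.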
Note that in a digital filter, the signal is represented by a sequence of numbers, rather than a voltage or current.

Fig2.2. Signal processing system

2.3.1 ADVANTAGES OF DIGITAL FILTERS:

The following list gives some of the main advantages of digital over analog filters.

A digital filter is programmable, i.e. its operation is determined by a program stored in the processor's memory. This means the digital filter can easily be changed without affecting the circuitry (hardware). An analog filter can only be changed by redesigning the filter circuit.

Digital filters are easily designed, tested and implemented on a general-purpose computer or workstation.

The characteristics of analog filter circuits (particularly those containing active components) are subject to drift and are dependent on temperature. Digital filters do not suffer from these problems, and so are extremely stable with respect both to time and temperature.

Unlike their analog counterparts, digital filters can handle low frequency signals accurately. As the speed of DSP technology continues to increase, digital filters are being applied to high frequency signals in the RF (radio frequency) domain, which in the past was the exclusive preserve of analog technology.

Digital filters are very much more versatile in their ability to process signals in a variety of ways; this includes the ability of some types of digital filter to adapt to changes in the characteristics of the signal.

2.3.2 OPERATION OF DIGITAL FILTERS:

Suppose the raw signal, which is to be digitally filtered, is in the form of a voltage waveform described by the function V = x(t), where t is time. This signal is sampled at time intervals h (the sampling interval).
The sampled value at time t = ih is

x_i = x(ih)

Thus the digital values transferred from the ADC to the processor can be represented by the sequence

x_0, x_1, x_2, x_3, ...

corresponding to the values of the signal waveform at times t = 0, h, 2h, 3h, ... (where t = 0 is the instant at which sampling begins). At time t = nh (where n is some positive integer), the values available to the processor, stored in memory, are

x_0, x_1, x_2, x_3, ..., x_n

Note that the sampled values x_(n+1), x_(n+2) etc. are not available, as they haven't happened yet! The digital output from the processor to the DAC consists of the sequence of values

y_0, y_1, y_2, y_3, ..., y_n

In general, the value of y_n is calculated from the values x_0, x_1, x_2, x_3, ..., x_n. The way in which the y's are calculated from the x's determines the filtering action of the digital filter.

2.3.3 FIR AND IIR FILTERS:

The impulse response of a digital filter is the output sequence from the filter when a unit impulse is applied at its input. (A unit impulse is a very simple input sequence consisting of a single value of 1 at time t = 0, followed by zeros at all subsequent sampling instants.) An FIR filter is one whose impulse response is of finite duration. An IIR filter is one whose impulse response (theoretically) continues forever, because the recursive (previous output) terms feed back energy into the filter input and keep it going.

Impulse Response: The impulse response of a FIR filter is actually just the set of FIR coefficients. (If you put an impulse into a FIR filter, consisting of a 1 sample followed by many 0 samples, the output of the filter will be the set of coefficients, as the 1 sample moves past each coefficient in turn to form the output.) The term IIR is not very accurate, because the actual impulse responses of nearly all IIR filters reduce virtually to zero in a finite time.

2.3.4 FIR FILTER:

Fig2.3. FIR filter

2.3.4.1 TERMS USED IN DESCRIBING FIR FILTERS:

Tap: A FIR tap is simply a coefficient/delay pair. The number of FIR taps (often designated as N) is an indication of 1) the amount of memory required to implement the filter, 2) the number of calculations required, and 3) the amount of filtering the filter can do; in effect, more taps means more stop band attenuation, less ripple, narrower filters, etc.

Multiply-Accumulate (MAC): In a FIR context, a MAC is the operation of multiplying a coefficient by the corresponding delayed data sample and accumulating the result. FIRs usually require one MAC per tap. Most DSP microprocessors implement the MAC operation in a single instruction cycle.

Transition Band: The band of frequencies between pass band and stop band edges. The narrower the transition band, the more taps are required to implement the filter. (A small transition band results in a sharp filter.)

Delay Line: The set of memory elements that implement the Z^-1 delay elements of the FIR calculation.

Circular Buffer: A special buffer which is circular because incrementing at the end causes it to wrap around to the beginning, or because decrementing from the beginning causes it to wrap around to the end. Circular buffers are often provided by DSP microprocessors to implement the movement of the samples through the FIR delay line without having to literally move the data in memory. When a new sample is added to the buffer, it automatically replaces the oldest one.

2.3.4.2 ADVANTAGES OF FIR FILTERS:

Compared to IIR filters, FIR filters offer the following advantages:

They can easily be designed to be linear phase (and usually are). Put simply, linear-phase filters delay the input signal, but don't distort its phase.

They are simple to implement. On most DSP microprocessors, the FIR calculation can be done by looping a single instruction.

They are suited to multi-rate applications.
By multi-rate, we mean decimation (reducing the sampling rate), interpolation (increasing the sampling rate), or both. Whether decimating or interpolating, the use of FIR filters allows some of the calculations to be omitted, thus providing an important computational efficiency. In contrast, if IIR filters are used, each output must be individually calculated, even if that output will be discarded (so that the feedback will be incorporated into the filter).

They have desirable numeric properties. In practice, all DSP filters must be implemented using finite-precision arithmetic, that is, a limited number of bits. The use of finite-precision arithmetic in IIR filters can cause significant problems due to the use of feedback, but FIR filters have no feedback, so they can usually be implemented using fewer bits, and the designer has fewer practical problems to solve related to non-ideal arithmetic.

They can be implemented using fractional arithmetic. Unlike IIR filters, it is always possible to implement a FIR filter using coefficients with magnitude of less than 1.0. (The overall gain of the FIR filter can be adjusted at its output, if desired.) This is an important consideration when using fixed-point DSPs, because it makes the implementation much simpler.

FIR filters are inherently stable. Since they have no feedback elements, any bounded input results in a bounded output.

2.3.4.3 DISADVANTAGES OF FIR FILTERS:

Compared to IIR filters, FIR filters sometimes have the disadvantage that they require more memory and/or calculation to achieve a given filter response characteristic. Also, certain responses are not practical to implement with FIR filters.

2.4 FOLDING:

Folding transformation is used to systematically determine the control circuits in DSP architectures where multiple algorithm operations, such as addition operations, are time-multiplexed to a single functional unit.
Thus the number of functional units in the implementation is reduced, resulting in an IC with low silicon area. This is an important aspect in synthesizing DSP architectures. In general, folding can be used to reduce the number of hardware functional units by a factor of N at the expense of increasing the computation time by a factor of N, where N is the number of algorithm operations executed on a single functional unit in hardware.

Folding transformation may also lead to an architecture that uses a large number of registers. To overcome this drawback, techniques can be used to compute the minimum number of registers required to implement a folded DSP architecture and to allocate data to these registers. Using register minimization techniques along with the folding transformation not only reduces the number of functional units but also keeps the area consumed by memory in the folded architecture to a minimum.

3. ARCHITECTURES OF FIR FILTER:

The direct form structure consists of a large number of filter taps that lead to excessive hardware complexity. Folding techniques have been proposed as a means of reducing the hardware complexity when the processing throughput required by the application is less than the throughput at which the circuit can operate. FIR filters are ideal candidates for folding since they are essentially a repetition of multiplications. The main drawback of folded FIR filter schemes is that while they achieve significant hardware reduction, they also reduce the sample rate.

A way to combine the merits of folded and unfolded filters is to cascade a number of folded FIR filter units. The partially folded filter is an intermediate form between the folded and unfolded forms of a filter, featuring higher throughput than the folded form and requiring less hardware than the unfolded form. A partially folded filter consists of a number of modules, each of them being a fully folded filter. Cascading p such modules increases the sample rate by a factor of p.

3.1 DIRECT FORM ARCHITECTURE:

The unfolded architecture consists of delays, multipliers and adders. The output is obtained every clock cycle. A k-tap filter consists of k multipliers, k-1 adders and k-1 delays. The input sample x is delayed, multiplied with the filter coefficients and accumulated to get the output. Fig3.1 shows a k-tap direct form FIR filter.

Fig3.1. FIR filter in direct form

3.2 FOLDED ARCHITECTURE OF K-TAP FIR FILTER:

The folded architecture consists of a multiplier and adder unit, which performs one multiplication and one addition every clock cycle. It therefore requires k clock cycles to perform the k such operations that produce a single output of a k-tap filter. The cyclic shift registers C-R store the filter coefficients in descending order, and k-1 input samples are stored in the D-R cyclic shift registers in descending order. These shift registers account for the delay elements present in the direct form of the unfolded architecture of an FIR filter. The C-R registers correspond to the data latches where the filter coefficients are stored when the filter is programmed. Fig3.2 shows the architecture of the folded FIR filter.

The convolution output is produced in k clock cycles. The term x(n-k+1)h(k-1) is computed first and the term x(n)h(0) is computed last. The multiplexer set mux1 is used to input the filter coefficients to the multiplier-add unit (mac). The select signal 'sel' is used to select between the two inputs, one being the external input 'hc' and the other the data out of the shift register C-R. During the first k clock cycles hc is selected, and for the remaining computations the output from C-R is selected. Multiplexer set mux2 is used to obtain a new input every kth clock cycle, which then gets stored in the D-R shift register. The third multiplexer set, mux3, is used to obtain the sum of products through accumulation in the mac unit and to clear the accumulator every kth clock cycle.
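The difference between the two architectures above can be checked with a behavioral sketch. This is a Python model under simplifying assumptions, not the project's Verilog: `delay_line` stands in for the D-R registers, the coefficient list stands in for C-R, the control signals are implied by the loop structure, and all names and test values are hypothetical.

```python
def direct_form_fir(x, h):
    """Unfolded reference: all k products are formed for every output."""
    k = len(h)
    return [sum(h[i] * x[n - i] for i in range(k) if n - i >= 0)
            for n in range(len(x))]


def folded_fir(x, h):
    """One multiply-add per clock on a single shared unit.

    Returns (outputs, clock_cycles_used).
    """
    k = len(h)
    delay_line = [0] * k          # delay_line[i] holds x(n - i)
    outputs = []
    clocks = 0
    for sample in x:
        # A new input is accepted once per k clocks (the job of mux2).
        delay_line = [sample] + delay_line[:-1]
        acc = 0                   # accumulator cleared per output (mux3)
        # Oldest term x(n-k+1)*h(k-1) first, x(n)*h(0) last.
        for i in reversed(range(k)):
            acc += h[i] * delay_line[i]
            clocks += 1
        outputs.append(acc)
    return outputs, clocks


h = [1, 2, 3, 4, 5, 6, 7, 8]                 # hypothetical 8-tap coefficients
x = [3, -1, 4, 1, -5, 9, 2, -6, 5, 3]        # hypothetical input samples
y, clocks = folded_fir(x, h)
print(y == direct_form_fir(x, h), clocks)    # True 80
```

Both models give the same outputs; the folded one spends k clock cycles per output on one multiplier, which is the hardware-for-time trade made by folding.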
The select signal for the second and the third multiplexer sets is 's'. The signal 's' is made high every kth clock cycle. Tristate buffers, which are enabled by signal 's', are used to obtain the final output. The circuit operates at clock frequency 'fc'. An input sample is processed every k clock cycles and therefore the filter operation frequency is fs = fc/k. The frequency of the control signal 's' is 'fs'. Fig3.3 shows the timing diagram of the operation.

Fig3.2. Folded architecture of k-tap FIR filter

Fig3.3. Timing diagram

3.3 CASCADED ARCHITECTURE OF FIR FILTER:

A cascaded structure is obtained by dividing the k-tap filter into many stages. Fig3.4 shows a two-stage structure obtained by dividing an 8-tap FIR filter and introducing a delay between the two stages. Each stage is partially folded and cascaded. Fig3.5 shows the cascaded architecture using two folded FIR filters in direct form. The first four filter coefficients are stored in the C-R register of stage 1 and the next four coefficients are stored in the C-R register of stage 2. The input sample to the second stage comes from the delay register D-R. The output of the second stage, 'yp', is used to initialize the accumulator of the first stage. The final output 'y' is obtained from the first stage. A delay exists in the sum line from the output of stage 2 to the input of the adder of stage 1 in Fig3.5. This corresponds to the delay at the input of the adder of stage 1 in Fig3.4.

The operation of each stage is similar to that of a normal folded FIR filter. The select signals 's' of both stages are synchronized. The select signal 'sel' to the multiplexer set mux1 is used to select the output of the C-R register after four clock cycles, as only four coefficients are used by each stage.
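The two-stage split described above can be modeled behaviorally. The sketch below is in Python, not the project's Verilog, and makes two labeled assumptions: both stages read from one shared sample history (standing in for the D-R registers), and stage 2 prepares its partial sum one sample period ahead so that stage 1 can start its accumulator from it. The function names and test data are hypothetical.

```python
def direct_form_fir(x, h):
    """Unfolded 8-tap reference used to check the cascaded model."""
    k = len(h)
    return [sum(h[i] * x[n - i] for i in range(k) if n - i >= 0)
            for n in range(len(x))]


def cascaded_two_stage_fir(x, h):
    """Two folded 4-tap stages; each sample period models 4 clock cycles."""
    assert len(h) == 8
    h1, h2 = h[:4], h[4:]        # coefficients held by stage 1 and stage 2
    history = [0] * 8            # history[i] holds x(n - i)
    yp = 0                       # partial sum handed from stage 2 to stage 1
    outputs = []
    for sample in x:
        history = [sample] + history[:-1]
        # Stage 1: accumulator starts from the yp of the previous period.
        acc1 = yp
        for i in range(4):
            acc1 += h1[i] * history[i]
        outputs.append(acc1)
        # Stage 2 (same 4 clocks): taps 4..7 of the NEXT output,
        # yp(n+1) = h4*x(n-3) + h5*x(n-4) + h6*x(n-5) + h7*x(n-6).
        acc2 = 0
        for i in range(4):
            acc2 += h2[i] * history[i + 3]
        yp = acc2
    return outputs


h = [1, 2, 3, 4, 5, 6, 7, 8]
x = [3, -1, 4, 1, -5, 9, 2, -6, 5, 3]
print(cascaded_two_stage_fir(x, h) == direct_form_fir(x, h))  # True
```

Because the two stages work in parallel on four taps each, an output is produced every four clock cycles instead of every eight, at the cost of a second multiply-add unit.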
The second stage produces a partial output 'yp' in one sample cycle, which is used by the first stage in the next sample cycle.

Fig3.4. 8-tap direct form FIR filter divided into two filter stages

Fig3.5. Cascaded structure of two folded filter stages

The timing diagram, shown in Fig3.6, clarifies the operation of the circuit. In this diagram y(n), yp(n) and the filter terms accumulated after every clock cycle by both stages are shown. The computation of the result y(n) lasts two sample cycles. During the first cycle, stage 2 computes yp(n). This is available in the last clock cycle and is used as an initial value for the accumulation performed by stage 1 in the second sample cycle. During this cycle stage 1 computes y(n) and stage 2 computes yp(n+1) of the next result. Thus the computations of two results, y(n) and yp(n+1), are overlapped and we obtain a result every four clock cycles. The select signal 's' is made high every fourth clock cycle to get the output.

Fig3.6. Timing diagram

4. MULTIPLIERS:

The most critical function carried out by the ALU is multiplication.
Digital multiplication is not the most fundamentally complex operation, but it is the most extensively used operation (especially in signal processing).
Innumerable schemes have been proposed for realization of the operation.

4.1 BASICS OF DIGITAL MULTIPLICATION:

Digital multiplication entails a sequence of additions carried out on partial products.
The method by which this partial product array is summed to give the final product is the key distinguishing factor amongst the numerous multiplication schemes.

4.2 ARRAY MULTIPLIER:

Partial products are independently computed in parallel.
Consider two binary numbers A and B, of m and n bits, respectively. P_k is known as the partial product term, also called the summand.

Fig4.1. Multiplier cell

4.2.1 BRAUN ARRAY MULTIPLIER:

Simplest parallel multiplier.
Suited only for positive operands.
The partial products are computed in parallel and then collected through a series of carry save adders.
The completion time is limited by the depth of the carry save array and by the carry propagation in the adder.

Fig. 4.2. Braun array multiplier

4.2.2 RIPPLE CARRY ARRAY MULTIPLIERS:

Row ripple form.
Unrolled shift-add algorithm.
Delay is proportional to N.

A ripple carry array multiplier (also called row ripple form) is an unrolled embodiment of the classic shift-add multiplication algorithm. The illustration shows the adder structure used to combine all the bit products in a 4×4 multiplier. The bit products are the logical AND of the bits from each input. They are shown in the form x, y in the drawing. The maximum delay is the path from either LSB input to the MSB of the product, and is the same (ignoring routing delays) regardless of the path taken. The delay is approximately 2*n.

Fig. 4.3. Ripple carry array multiplier

4.2.3 CARRY SAVE ARRAY MULTIPLIERS:

Column ripple form.
Fundamentally the same delay and gate count as the row ripple form.
Gate-level speed-ups are available for ASICs.
The ripple adder can be replaced with a faster carry tree adder.
Regular routing pattern.

Fig. 4.5. Carry save array multiplier

4.3 TREE MULTIPLIER:

Offers the potential for multiplication in time O(log n).
Once the partial product array is formed, the bits are passed to a reduction network.
Here column-wise compression of the bits takes place, yielding two final partial products.
The final product is obtained by adding these two partial products.
Considered to be irregular in form, and does not permit efficient VLSI realization.

Fig. 4.6. Flowchart of a tree multiplier

4.3.1 WALLACE TREE MULTIPLIER:

Partial sum adders can be rearranged in a tree-like fashion, reducing the critical path and the number of cells needed.

Fig. (a) Only column 3 has to add 4 bits. All others are less complex.
Fig. (b) Half Adders (HA) in columns 3 and 4.
Fig. (c) Full Adders (FA) in columns 3, 4, and 5; HA in column 2.
Fig. (d) Finally, HA from column 1 to 6.

Fig. 4.7. Transforming a partial product tree into a Wallace tree

Wallace tree multiplier implementation.
Substantial saving on larger multipliers.

Fig. 4.8. Wallace tree multiplier

5. SOURCE CODE:

5.1 UNFOLDED FIR FILTER:

5.1.1 TOP MODULE:

    module FIR_filter(x,h0,h1,h2,h3,h4,h5,h6,h7,clk,y);
      input  [7:0]  x,h0,h1,h2,h3,h4,h5,h6,h7;
      input         clk;
      output [18:0] y;
      wire [7:0]  x0,x1,x2,x3,x4,x5,x6,x7;
      wire [7:0]  x0_bar,x1_bar,x2_bar,x3_bar,x4_bar,x5_bar,x6_bar,x7_bar;
      wire [15:0] p0,p1,p2,p3,p4,p5,p6,p7;
      wire [16:0] s0,s1,s2,s3;
      wire [17:0] s4,s5;
      // Note: 'reset' is not in the port list, so it is an implicit
      // 1-bit net with no driver in this module.
      // Tapped delay line: x0..x7 hold the eight most recent samples.
      delay d0(x,clk,reset,x0,x0_bar);
      delay d1(x0,clk,reset,x1,x1_bar);
      delay d2(x1,clk,reset,x2,x2_bar);
      delay d3(x2,clk,reset,x3,x3_bar);
      delay d4(x3,clk,reset,x4,x4_bar);
      delay d5(x4,clk,reset,x5,x5_bar);
      delay d6(x5,clk,reset,x6,x6_bar);
      delay d7(x6,clk,reset,x7,x7_bar);
      // One multiplier per tap.
      mul_8x8 m0(x0,h0,p0);
      mul_8x8 m1(x1,h1,p1);
      mul_8x8 m2(x2,h2,p2);
      mul_8x8 m3(x3,h3,p3);
      mul_8x8 m4(x4,h4,p4);
      mul_8x8 m5(x5,h5,p5);
      mul_8x8 m6(x6,h6,p6);
      mul_8x8 m7(x7,h7,p7);
      // Adder tree: the sum grows by one bit per level.
      adder_16bit a0(p0,p1,s0), a1(p2,p3,s1), a2(p4,p5,s2), a3(p6,p7,s3);
      adder_17bit a4(s0,s1,s4), a5(s2,s3,s5);
      adder_18bit a6(s4,s5,y);
    endmodule

5.1.2 16-BIT ADDER:

    module adder_16bit(A,B,sum);
      input  [15:0] A,B;
      output [16:0] sum;
      assign sum=A+B;
    endmodule

5.1.3 17-BIT ADDER:

    module adder_17bit(A,B,sum);
      input  [16:0] A,B;
      output [17:0] sum;   // declared as an input in the scraped listing; it must be an output
      assign sum=A+B;
    endmodule

5.1.4 18-BIT ADDER:

    module adder_18bit(A,B,sum);
      input  [17:0] A,B;
      output [18:0] sum;
      assign sum=A+B;
    endmodule

5.1.5 D FLIPFLOP:

    module delay(D,CLK,reset,Q,Q_bar);
      input  [7:0] D;
      input        CLK,reset;
      output [7:0] Q;
      output [7:0] Q_bar;
      reg    [7:0] Q;
      assign Q_bar=~Q;
      always @(posedge CLK or negedge reset)
        if(reset==0)Q
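Section 4.1 describes multiplication as a sequence of additions carried out on partial products. A minimal behavioural sketch of that idea for 8-bit unsigned operands is given here. The module name `array_mul_sketch` is hypothetical, and this is not the report's `mul_8x8` listing; the `+` operators leave the choice of adder structure (ripple carry, carry save, Wallace tree) to the synthesis tool, so the sketch shows only the partial-product array, not any one of the four multiplier structures.

```verilog
// Behavioural sketch only: hypothetical module, not the report's mul_8x8.
module array_mul_sketch (
    input  wire [7:0]  a,   // multiplicand (unsigned)
    input  wire [7:0]  b,   // multiplier   (unsigned)
    output wire [15:0] p    // 16-bit product
);
    // pp[i] is row i of the partial-product array:
    // every bit of a ANDed with b[i], then weighted by 2^i.
    wire [15:0] pp [0:7];

    genvar i;
    generate
        for (i = 0; i < 8; i = i + 1) begin : row
            assign pp[i] = {8'b0, a & {8{b[i]}}} << i;
        end
    endgenerate

    // Sum the eight rows. The grouping mirrors a balanced adder tree,
    // but the adder implementation itself is left to the tool.
    assign p = ((pp[0] + pp[1]) + (pp[2] + pp[3]))
             + ((pp[4] + pp[5]) + (pp[6] + pp[7]));
endmodule
```

Because 255 × 255 = 65025 fits in 16 bits, no row or intermediate sum overflows the 16-bit width used here.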

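The folded architecture of Section 3.2 reuses a single multiplier and accumulator over k clock cycles and raises 's' once per sample. A behavioural sketch for k = 8 follows. It is not the report's Section 5.2 code: the module name `fir_folded_sketch`, the flattened coefficient bus `h_flat`, the counter `cnt`, and the active-low asynchronous reset are all assumptions of this sketch, and the multiplexer sets and tristate buffers of Fig. 3.2 are replaced by indexed selects and an output register.

```verilog
// Behavioural sketch only: hypothetical module, not the report's folded filter.
module fir_folded_sketch (
    input  wire        clk,
    input  wire        reset,   // active low, asynchronous (assumption)
    input  wire [7:0]  x,       // input sample, taken once every 8 clocks
    input  wire [63:0] h_flat,  // h0 in bits [7:0] ... h7 in bits [63:56]
    output reg         s,       // high for one clock when y has been updated
    output reg  [18:0] y        // 19 bits, as in the unfolded filter
);
    reg [7:0]  dr [0:7];        // D-R: holds x(n) ... x(n-7)
    reg [2:0]  cnt;             // tap index 0..7, wraps around
    reg [18:0] acc;             // running sum of products
    integer    j;

    wire [7:0]  h_sel = h_flat[8*cnt +: 8];      // coefficient for this tap
    wire [15:0] prod  = dr[cnt] * h_sel;         // the single shared multiplier
    wire [18:0] sum   = ((cnt == 3'd0) ? 19'd0 : acc) + prod;

    always @(posedge clk or negedge reset) begin
        if (reset == 1'b0) begin
            cnt <= 3'd0;
            acc <= 19'd0;
            y   <= 19'd0;
            s   <= 1'b0;
            for (j = 0; j < 8; j = j + 1) dr[j] <= 8'd0;
        end else begin
            acc <= sum;
            cnt <= cnt + 3'd1;
            s   <= (cnt == 3'd7);
            if (cnt == 3'd7) begin
                y <= sum;                        // all 8 taps accumulated
                for (j = 7; j > 0; j = j - 1) dr[j] <= dr[j-1];
                dr[0] <= x;                      // take the next sample
            end
        end
    end
endmodule
```

With this arrangement an output appears every eight clock cycles, which matches fs = fc/k for k = 8, at the cost of one multiplier instead of eight.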
Monday, October 21, 2019
