# Biomedical Signal Processing: A High-Speed Multiplier for Medical Applications

Z. Shaheel Hameed <sup>1</sup>, P. Boopathi <sup>2</sup>, M. Venkatesh <sup>3</sup>, B. Rajeshwari <sup>4</sup>

<sup>1</sup>Assistant Professor, <sup>2,3,4</sup>Final Year B.E. ECE, Department of Electronics and Communication Engineering, Al-Ameen Engineering College (Autonomous), Erode – 638 104, Tamilnadu, India

#### Abstract:

Multipliers play a crucial role in biomedical signal processing, enabling efficient computation in medical imaging, biosignal analysis, and real-time diagnostics. With the growing demand for high-speed, low-power processing in portable and implantable medical devices, optimizing multiplier architecture is essential. This study presents a compact VLSI design for a four-bit multiplier tailored for biomedical applications requiring minimal power consumption and high computational efficiency. The proposed design integrates a novel hybrid single-bit full adder with the Dadda algorithm to enhance performance. Compared to conventional multipliers, the proposed architecture achieves a 65.9% reduction in latency and a 24.5% decrease in power consumption, making it highly suitable for low-energy medical devices. The design is synthesized and simulated using the CADENCE 5.1.0 EDA tool with Spectre Virtuoso, ensuring precision and feasibility for biomedical applications. This research aims to contribute to the advancement of VLSI-based medical signal processing, optimizing real-time healthcare monitoring and diagnostics.

**Keywords:** multiplier, architecture, high-speed, efficiency, applications, VLSI design

# INTRODUCTION

Multiplication is a fundamental arithmetic operation. Multiplication-based operations such as multiply-accumulate (MAC) and inner product are among the most commonly used computation-intensive arithmetic functions (CIAF) in DSP applications including convolution, FFT, and filtering, as well as in the arithmetic and logic units of microprocessors.
Because multiplication takes up most of the processing time in most DSP algorithms, a high-speed multiplier is required. Multiplication time is still the dominant factor in determining the instruction cycle time of a DSP chip. As the number of technology and data application areas grows, so does the need for high-speed computing. Many real-time digital signal processing systems require high-throughput arithmetic operations to attain the desired speed [1-3]. Multiplication is one of the most important arithmetic operations in such systems, and the design of fast multiplier circuits has attracted attention for decades. For many applications, minimizing delay and energy consumption is a critical requirement. Different multiplier topologies are presented in this paper. One class of fast, low-power multipliers is based on empirical formulas [4].

For space-based Earth research and planetary exploration, establishing computationally efficient processing algorithms for enormous volumes of hyperspectral data is crucial [5]. The availability of satellite image data from a variety of sensors on a variety of platforms, with a broad range of spatiotemporal, radiometric, and spectral resolutions, has made remote sensing (RS) an ideal source of information for large-scale research and applications. Reported RS uses include hydrologic models, watershed monitoring, water and energy flux calculation, fractional forest cover, impermeable surface morphology mapping, urban model construction, and drought-related statistical inference based on remote sensing techniques [6-8]. In addition, many RS imaging applications, such as target tracking for military and homeland defense/security purposes and risk monitoring and response, require a response in (near) real time. Hyperspectral imaging is a spectral imaging technology that creates images of the same area on the Earth's surface with thousands of spectral bands at multiple wavelength channels.
Despite recent efforts to bring distributed and parallel computing into hyperspectral image processing, there are no standardised architectures or VLSI circuits for this purpose in sensor technologies [9]. Furthermore, while existing theory provides a variety of numerical and descriptive regularisation methods for image enhancement, many RS application areas still face unresolved key hypotheses and internal shortcomings related to the computational complexity of newly developed, sophisticated procedures. In the case of SAR systems, these descriptive-regularization methods must cope with unknown statistics of random signal perturbations in a turbulent medium, incomplete array measurement, finite dimensionality of measurements, cumulative signal-dependent impulse noise, uncontrolled antenna vibrations, and arbitrary carrier path deviations [10-12]. Moreover, these methods are not suited for (near) real-time implementation on current DSP or PC platforms.

Cuest.fisioter.2024.53(3):416-428

The use of specialised processor arrays in VLSI designs, as coprocessors or stand-alone chips, in combination with FPGA units through hardware/software co-design is a genuine option for high-speed signal processing (SP) that can reach the desired system data throughput. It is also worth noting that, while cluster-based computing is the most extensively used platform at ground stations, it is unfeasible for on-board processing because of size, cost, and power constraints [13]. In both cluster-based systems and embedded devices, FPGA-based reconfigurable processing in combination with custom VLSI designs is emerging as a novel solution that offers huge computing capability [14].

In this paper, we focus on two specific contributions related to the significant reduction in the computational burden of the descriptive-regularized RS iterative reconstruction technique, obtained by executing it on massively parallel processor arrays through the aggregation of mid-range VLSI architectures on an FPGA platform.
First, we discuss the algorithmic design of a family of descriptive-regularization methods for the uncertain RS environment over spectrum and azimuth location, as well as the relevant algorithmic formulas for applying them to imaging cluster radars and partial-imaging SAR operating in various uncertain scenarios.

#### 2. PROPOSED METHOD

### 2.1. Dadda Algorithm

In the proposed design, the delay of the multiplier's critical path is reduced by using the Dadda method to shorten the partial-product tree [8]. The proposed four-bit multiplier is made up of 16 sub-blocks. Figure 1 depicts a sample 4×4 multiplication with a tree of height four. The Dadda method reduces the height of the tree to two.

| | | | | MD3 | MD2 | MD1 | MD0 |
|-------|-------|-------|-------|-------|-------|-------|-------|
| | | | × | MR3 | MR2 | MR1 | MR0 |
| | | | | MD3MR0 | MD2MR0 | MD1MR0 | MD0MR0 |
| | | | MD3MR1 | MD2MR1 | MD1MR1 | MD0MR1 | |
| | | MD3MR2 | MD2MR2 | MD1MR2 | MD0MR2 | | |
| | MD3MR3 | MD2MR3 | MD1MR3 | MD0MR3 | | | |
| Prod7 | Prod6 | Prod5 | Prod4 | Prod3 | Prod2 | Prod1 | Prod0 |

Fig. 1. Sample 4×4 multiplication

The Dadda method lowers the propagation delay because computing the next partial output does not have to wait for a result from the previous level. The Dadda method reduces the tree height from four to two in the first level. The height is lowered from three to two in the second level, and the overall height of the multiplication tree is reduced to two in the final stage. Figure 2 depicts the level-by-level reduction technique.

| | | | | | | |
|---|---|---|---|---|---|---|
| MD3MR3 | MD3MR2 | MD3MR1 | MD3MR0 | MD2MR0 | MD1MR0 | MD0MR0 |
| | MD2MR3 | MD2MR2 | MD2MR1 | MD1MR1 | MD0MR1 | |
| | | MD1MR3 | MD1MR2 | MD0MR2 | | |
| | | | MD0MR3 | | | |

Fig. 2. Last level

### 2.2. Proposed 4×4 Multiplier

Figure 3 shows a general schematic diagram of the proposed 4×4 multiplier. The multiplier is built from a hybrid three-input binary adder circuit (full adder) based on pass-transistor logic and a two-input binary adder circuit (half adder). There are ten transistors in this adder circuit.
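The column-wise reduction described in Section 2.1 can be checked at the bit level with a short behavioural model. The sketch below is not the authors' circuit: it assumes the textbook Dadda schedule for a 4×4 array (column-height limits of 3 and then 2), so the number and placement of adders need not match the proposed design. The names `half_adder`, `full_adder`, and `dadda_multiply_4x4` are hypothetical, the multiplexer wiring of the carry output is an assumption, and Python integer addition stands in for the final carry-propagate adder.

```python
def half_adder(a, b):
    """Two-input binary adder: returns (sum, carry)."""
    return a ^ b, a & b


def full_adder(a, b, cin):
    """Three-input binary adder modelled as XOR, XOR, MUX (logic level only)."""
    p = a ^ b                # first XOR stage
    s = p ^ cin              # second XOR stage gives the sum
    cout = cin if p else a   # 2:1 multiplexer gives the carry (assumed wiring)
    return s, cout


def dadda_multiply_4x4(md, mr, limits=(3, 2)):
    """Multiply two 4-bit values; returns (product, full_adders, half_adders)."""
    n = 4
    # cols[w] holds every bit of weight 2**w; 16 AND terms fill the columns.
    cols = [[] for _ in range(2 * n)]
    for i in range(n):
        for j in range(n):
            cols[i + j].append(((md >> i) & 1) & ((mr >> j) & 1))

    fa_count = ha_count = 0
    for limit in limits:                      # one pass per reduction stage
        next_cols, carries_in = [], []
        for w in range(2 * n):
            # Carries from column w-1 count towards the height of column w,
            # but sit at the front so adders consume earlier-stage bits first.
            pending = carries_in + cols[w]
            sums, carries_out = [], []
            while len(pending) + len(sums) > limit:
                if len(pending) + len(sums) == limit + 1:
                    s, c = half_adder(pending.pop(), pending.pop())
                    ha_count += 1
                else:
                    s, c = full_adder(pending.pop(), pending.pop(), pending.pop())
                    fa_count += 1
                sums.append(s)
                carries_out.append(c)
            next_cols.append(pending + sums)
            carries_in = carries_out
        cols = next_cols

    # Every column now has at most two bits; add the two rows.
    assert max(len(col) for col in cols) <= 2
    product = sum(bit << w for w, col in enumerate(cols) for bit in col)
    return product, fa_count, ha_count


if __name__ == "__main__":
    print(dadda_multiply_4x4(13, 11))
```

Keeping the bits in per-weight columns makes the height limit of each stage explicit, and the adder counters give a quick way to compare alternative reduction schedules against the adder budget of a given design.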
The first level of the multiplier has 16 partial products, which are generated using 16 logical AND gates. In the second level, the height of the tree is halved by using three three-input binary adder circuits (full adders) and one two-input binary adder circuit (half adder). The height is reduced further in the third level by using two two-input binary adder circuits and two three-input binary adder circuits.

Fig. 3. Design of the proposed 4×4 multiplier

The proposed architecture consists of eight three-input binary adder circuits, four two-input binary adder circuits, and eight buffers. Figure 4 shows the schematic diagram of the basic AND cell.

Fig. 4. AND gate

Fig. 5 depicts the circuit design of the single-bit full adder used in the proposed multiplier. It consists of three modules: T1, T2, and T3. Module T1 is a GDI XOR gate; modules T2 and T3 are a PTL XOR and a PTL MUX, respectively. Modules T1 and T2 produce the sum output of the full adder, and module T3 produces the carry signal.

Fig. 5. Proposed single-bit full adder

The multiplier design also relies on a two-input binary adder circuit (half adder). Four two-input binary adder circuits are used in the 4×4 multiplier. Fig. 6 shows the CMOS schematic of the two-input binary adder.

Fig. 6. Two-input binary adder circuit

Buffers in the multiplier carry signals from the first level to the final level while maintaining the reference voltage level. Figure 7 shows the buffer.

Fig. 7. Buffer

### 3. SIMULATION RESULTS

The proposed multiplier is designed using the CADENCE 5.1.0 EDA tool and simulated with Spectre Virtuoso. The RTL schematic view of the proposed multiplier in Cadence is shown in Fig. 8. The transient behaviour of the proposed 4-bit multiplier is shown in Fig. 9.

Fig. 8. Schematic view of the proposed multiplier in Cadence
Fig. 9. Transient response of the multiplier

The power and delay calculations in Cadence are shown in Figures 10 and 11.

Fig. 10. Calculation of power in Cadence

Fig. 11. Calculation of delay in Cadence

Table I shows a comparison of various multipliers.

| No. | Multiplier | No. of transistors | Power (mW) | Delay (ns) |
|----|---------------------------------------------------|----------------------|-------------|-------------|
| 1 | Using conventional CMOS full adder | 392 | 0.0058 | 3.834 |
| 2 | Using hybrid full adder (existing) | 264 | 0.00224 | 3.0603 |
| 3 | 4-bit Dadda multiplier using compressor | 376 | 1.172 | 0.353 |
| 4 | Dadda tree multiplier using adiabatic logic | - | 77 | - |
| 5 | 4-bit static CMOS based Dadda multiplier | 316 | - | - |
| 6 | Proposed multiplier | 248 | 0.00169 | 1.0409 |

Table I compares the performance of multipliers built with various types of full adders in terms of power consumption, critical-path delay, and transistor count. The table shows that the proposed multiplier has the lowest circuit complexity (248 transistors), the lowest reported power consumption at about 2 µW (0.00169 mW), and a critical-path delay of 1.04 ns, which is short compared with the conventional and existing hybrid full-adder multipliers. Graphical comparisons of power and propagation delay are shown in Figures 12 and 13.

Fig. 12. Comparison of power in milliwatts (mW)

Fig. 13. Comparison of delay in nanoseconds (ns)

The logic utilization of the modules in the four-bit multiplier is shown in Table II.
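The relative improvements implied by Table I can be reproduced with a few lines of arithmetic. The sketch below copies the power and delay values of three rows from the table; the dictionary keys and the function names `reduction_percent` and `compare_to_proposed` are hypothetical. Measured against the existing hybrid full-adder multiplier (row 2), the reductions come out at roughly 65.99% in delay and 24.55% in power, which agrees with the 65.9% and 24.5% quoted in the abstract if those figures are truncated rather than rounded.

```python
# Values copied from Table I (power in mW, delay in ns).
TABLE_I = {
    "conventional CMOS full adder": {"power_mw": 0.0058, "delay_ns": 3.834},
    "hybrid full adder (existing)": {"power_mw": 0.00224, "delay_ns": 3.0603},
    "proposed": {"power_mw": 0.00169, "delay_ns": 1.0409},
}


def reduction_percent(baseline, improved):
    """Relative reduction of `improved` with respect to `baseline`, in percent."""
    return 100.0 * (baseline - improved) / baseline


def compare_to_proposed(baseline_name):
    """Delay and power reduction of the proposed design against one baseline row."""
    base = TABLE_I[baseline_name]
    prop = TABLE_I["proposed"]
    return {
        "delay": reduction_percent(base["delay_ns"], prop["delay_ns"]),
        "power": reduction_percent(base["power_mw"], prop["power_mw"]),
    }


if __name__ == "__main__":
    for name in TABLE_I:
        if name != "proposed":
            r = compare_to_proposed(name)
            print(f"{name}: delay -{r['delay']:.2f}%, power -{r['power']:.2f}%")
```

Rows 3 to 5 of the table are left out of the sketch because they either lack a power or delay value or were measured for a different design style.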
# **Table II. Logic Utilization of Multiple Components**

| Module | No. of Transistors | Technique used |
|----------------------------------|------------------|-------------------------|
| Half adder | 40 | CMOS Process Technology |
| Buffer | 32 | CMOS Process Technology |
| AND gate | 96 | CMOS Process Technology |
| Full adder (area- and power-efficient single-bit full adder) | 80 | Sum: GDI XOR |
| | 50 | Carry: PTL XOR |

#### **CONCLUSIONS:**

To achieve low power, minimum latency, and low circuit complexity, the proposed 4-bit multiplier employs a hybrid, efficient single-bit three-input binary adder circuit (full adder). PTL and GDI techniques are used in the three-input binary adder to achieve low power consumption and short propagation delay. The hybrid three-input binary adder circuit gives the fastest response time and the highest throughput. Applying the Dadda method reduces the propagation delay. The 4×4 multiplier consumes an average of 1.69 µW and has a propagation delay of 1.04 ns. These values are substantially lower than those of the previous multiplier design.

#### **REFERENCES:**

1. Hussain, Inamul, Chandan Kumar Pandey, and Saurabh Chaudhury. "Design and analysis of high performance multiplier circuit." *2019 Devices for Integrated Circuit (DevIC)*. IEEE, 2019.
2. Ganjikunta, Ganesh Kumar, Sibghatullah I. Khan, and M. Mahaboob Basha. "A High-Performance Signed-Unsigned Multiplier Using Vedic Mathematics." *Journal of Low Power Electronics* 15.3 (2019): 302-308.
3. Rahnamaei, Ali, Gholamreza Zare Fatin, and Abdollah Eskandarian. "High speed Radix-4 Booth scheme in CNTFET technology for high performance parallel multipliers." *International Journal of Nano Dimension* 10.3 (2019): 281-290.
4. Roy, Debapriya Basu, and Debdeep Mukhopadhyay. "High-speed implementation of ECC scalar multiplication in GF (p) for generic Montgomery curves."
*IEEE Transactions on Very Large Scale Integration (VLSI) Systems* 27.7 (2019): 1587-1600.
5. Sakthivel R., Sundareswari K., Mathiyalagan K., Santra S. "Reliable H∞ Stabilization of Fuzzy Systems with Random Delay Via Nonlinear Retarded Control." *Circuits, Systems, and Signal Processing* (2016).
6. Karuppusamy, P. "Design and analysis of low-power, high-speed baugh wooley multiplier." *Journal of Electronics* 1.02 (2019): 60-70.
7. Deepa, A., and C. N. Marimuthu. "Design of a high speed Vedic multiplier and square architecture based on Yavadunam Sutra." *Sādhanā* 44.9 (2019): 1-10.
8. Nithya, J., and S. R. Ramesh. "Design of Delay Efficient Hybrid Adder for High Speed Applications." *2019 5th International Conference on Advanced Computing & Communication Systems (ICACCS)*. IEEE, 2019.
9. Balaji G., Vengataasalam S., Sekar S. "Numerical investigation of second order singular system using single-term haar wavelet series method." *Research Journal of Applied Sciences* (2013).
10. Bagherizadeh, Mehdi, Mohammad Hossein Moaiyeri, and Mohammad Eshghi. "A high-performance 5-to-2 compressor cell based on carbon nanotube FETs." *International Journal of Electronics* 106.6 (2019): 912-927.
11. Desai, Krupa, Anand D. Darji, and Harikrishna M. Singapuri. "Implementation of High speed, Low Power Modified Vedic Multiplier and Its Application in Lifting based Discrete Wavelet Transform." *TENCON 2019-2019 IEEE Region 10 Conference (TENCON)*. IEEE, 2019.
12. Chaitanya, C. V. S., et al. "Asic design of low power-delay product carry pre-computation based multiplier." *Indonesian Journal of Electrical Engineering and Computer Science* 13.2 (2019): 845-852.
13. Chaitanya, C. V. S., et al. "Design of modified booth based multiplier with carry precomputation." *Indonesian Journal of Electrical Engineering and Computer Science* 13.3 (2019): 1048-1055.
14. Mathana, J. M., R. Dhanagopal, and R. Menaka. "VLSI Architecture for High Performance Wallace Tree Encoder." *2020 6th International Conference on Advanced Computing and Communication Systems (ICACCS)*. IEEE, 2020.
15. Parihar, Aashish, and Sangeeta Nakhate. "High-speed high-throughput vlsi architecture for rsa montgomery modular multiplication with efficient format conversion." *Journal of The Institution of Engineers (India): Series B* 100.3 (2019): 217-222.
16. Sankaram, R. Bheema, and Guthula Sailakshmi. "Design And Implementation Of A High Speed And Area Efficient Vlsi Architecture Of Binary Adder." *Turkish Journal of Computer and Mathematics Education (TURCOMAT)* 12.12 (2021): 4819-4825.
17. John, Tintu Mary, and Shanty Chacko. "Design of high speed VLSI Architecture for FIR filter using FPPE."