ISSN 1989-9572

DOI: 10.47750/jett.2023.14.06.014

# ENERGY-EFFICIENT APPROXIMATE MULTIPLIERS FOR ENHANCED IMAGE PROCESSING APPLICATIONS

#1 M. ANJAIAH, Assistant Professor, #2 D. BABU, Assistant Professor, #3 T. SANTHOSH, Assistant Professor,
Department of Electronics and Communication Engineering, TRINITY COLLEGE OF ENGINEERING & TECHNOLOGY, KARIMNAGAR, TS.

Journal for Educators, Teachers and Trainers, Vol.14(6)

https://jett.labosfor.com/

Date of reception: 03 June 2023
Date of revision: 29 Oct 2023
Date of acceptance: 20 Nov 2023

M. ANJAIAH, D. BABU, T. SANTHOSH (2023). ENERGY-EFFICIENT APPROXIMATE MULTIPLIERS FOR ENHANCED IMAGE PROCESSING APPLICATIONS. *Journal for Educators, Teachers and Trainers*, Vol.14 (6). 135-142.

ABSTRACT: The primary objective of this work is to design two distinct 8×8 approximate multipliers built from approximate 2:2 and 3:2 compressors. The multipliers follow the Wallace multiplier principle and are referred to as approximate Wallace multipliers (AWMs). The proposed AWMs were evaluated using a set of design metrics (DMs): power, delay, power-delay product (PDP), and area. They were compared against six Wallace multipliers built from previously reported 3:2 compressors. Each multiplier was described in Verilog and synthesized with Cadence's RTL Compiler (RC) tool using a 180 nm standard cell library. The results indicate that the proposed AWMs improve on the compared designs across the DMs. The proposed AWMs were also evaluated in an image processing application.
The proposed approximate Wallace multipliers (AWMs) were compared with the other Wallace multipliers in terms of peak signal-to-noise ratio (PSNR). The measured PSNR of the proposed multipliers exceeds 50 dB.

Keywords: Approximate Computation, Wallace Multiplier, 3:2 Compressor, Low Power, PDP

#### 1. INTRODUCTION

Next-generation embedded systems will need substantial computational resources to handle multimedia applications such as image, audio, and video processing. These applications can gracefully tolerate a range of computational errors, including errors that arise from inexact hardware. Compared with exact computing, approximate computing is generally faster but less precise.

Digital signal processing (DSP) blocks are the core of such systems. They support multimedia applications and perform numerous arithmetic operations, and their basic building blocks are adders and multipliers. The accuracy and efficiency of the overall system therefore depend on these arithmetic circuits, and improving the energy efficiency and architecture of adder and multiplier circuits for DSP is an active area of research.

In recent years, approximate multipliers for multimedia applications have received significant attention. One reported study evaluates a Baugh-Wooley multiplier using a 64-tap finite impulse response (FIR) filter. Another presents a rounding-based approximate multiplier for signed and unsigned multiplication. Wallace tree and Dadda multipliers have been assessed in terms of power-delay product (PDP), energy, delay, and area.
Four recently reported unsigned approximate multipliers have been evaluated in terms of peak signal-to-noise ratio (PSNR), power consumption, delay, and transistor count. Several approximate 4:2 compressors have also been reported, including a multiplier built from a modified 4:2 compressor. Considerable effort has gone into approximate arithmetic building blocks, and a number of 1-bit full adder (FA) cells have been investigated and assessed at different levels of design abstraction.

The rest of the paper is organized as follows. Section 2 presents the proposed AWM and its approximate compressors. Section 3 covers the synthesis environment, the simulation results, and their analysis, and also assesses the proposed multipliers in an image processing application. Section 4 concludes the paper.

## 2. DESIGN OF PROPOSED WALLACE MULTIPLIER USING NOVEL APPROXIMATE COMPRESSORS

This section describes the design of an 8×8 AWM using the proposed approximate 2:2 and 3:2 compressors.

**PROPOSED COMPRESSORS**

Three new approximate compressors (ACs) are proposed in this section: AC1, AC2, and AC3. AC1 and AC2 are 3:2 compressors, while AC3 is a 2:2 compressor. They are obtained by modifying the truth tables (TTs) of the exact 3:2 and 2:2 compressors, EC1 and EC2, respectively.

Table.1. Truth tables of the exact and approximate 3:2 compressors.
| A | B | C<sub>in</sub> | Sum (EC<sub>1</sub>) | C<sub>out</sub> (EC<sub>1</sub>) | Sum<sub>1</sub> (AC<sub>1</sub>) | C<sub>out1</sub> (AC<sub>1</sub>) | Sum<sub>2</sub> (AC<sub>2</sub>) | C<sub>out2</sub> (AC<sub>2</sub>) |
|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 0 | 0 | 0 | 1× | 0✓ | 1× | 0✓ |
| 0 | 0 | 1 | 1 | 0 | 1✓ | 0✓ | 1✓ | 0✓ |
| 0 | 1 | 0 | 1 | 0 | 0× | 0✓ | 0× | 0✓ |
| 0 | 1 | 1 | 0 | 1 | 0✓ | 0× | 0✓ | 0× |
| 1 | 0 | 0 | 1 | 0 | 0× | 1× | 0× | 0✓ |
| 1 | 0 | 1 | 0 | 1 | 0✓ | 1✓ | 0✓ | 0× |
| 1 | 1 | 0 | 0 | 1 | 0✓ | 1✓ | 0✓ | 1✓ |
| 1 | 1 | 1 | 1 | 1 | 0× | 1✓ | 0× | 1✓ |
| ED | | | | | 4 | 2 | 4 | 2 |

In the approximate columns, ✓ marks an output that matches the exact compressor and × marks one that differs.

Table.2. Truth tables of the exact and approximate 2:2 compressors.

| X | Y | Sum (EC<sub>2</sub>) | C<sub>out</sub> (EC<sub>2</sub>) | Sum<sub>3</sub> (AC<sub>3</sub>) | C<sub>out3</sub> (AC<sub>3</sub>) |
|---|---|---|---|---|---|
| 0 | 0 | 0 | 0 | 1× | 0✓ |
| 0 | 1 | 1 | 0 | 1✓ | 0✓ |
| 1 | 0 | 1 | 0 | 1✓ | 0✓ |
| 1 | 1 | 0 | 1 | 0✓ | 1✓ |
| ED | | | | 1 | 0 |

Table.3. Logic equations of the proposed ACs.

| Proposed Compressor | Sum | C<sub>out</sub> |
|---|---|---|
| $AC_1$ | $\overline{A + B}$ | $A$ |
| $AC_2$ | $\overline{A + B}$ | $A \cdot B$ |
| $AC_3$ | $\bar{X} + \bar{Y}$ | $X \cdot Y$ |

Table 1 gives the truth tables of the proposed 3:2 ACs alongside the exact compressor EC1, and Table 2 gives those of the 2:2 compressors EC2 and AC3. In Table 1, A, B, and Cin are the inputs, and there are six outputs: Sum, Cout, Sum1, Cout1, Sum2, and Cout2. Sum, Sum1, and Sum2 are the Sum outputs of EC1, AC1, and AC2, respectively.
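The error counts in the ED rows of Tables 1 and 2 can be checked mechanically. The sketch below is illustrative only: the function names are hypothetical, and the approximate outputs are read off the truth tables above (Sum1 and Sum2 as the NOR of A and B, Sum3 as the NAND of X and Y), on the assumption that the ED row counts the input rows where an approximate output differs from the exact one.

```python
from itertools import product

# Exact compressors: EC1 is a full adder, EC2 is a half adder.
def ec1(a, b, cin):
    return a ^ b ^ cin, (a & b) | (a & cin) | (b & cin)

def ec2(x, y):
    return x ^ y, x & y

# Approximate compressors, read off the truth tables (illustrative names).
def ac1(a, b, cin):
    return 1 - (a | b), a            # Sum1 = NOR(A, B), Cout1 = A

def ac2(a, b, cin):
    return 1 - (a | b), a & b        # Sum2 = NOR(A, B), Cout2 = A AND B

def ac3(x, y):
    return 1 - (x & y), x & y        # Sum3 = NAND(X, Y), Cout3 = X AND Y

def error_counts(exact, approx, n_inputs):
    """Count input rows where the approximate Sum / Cout differ from the exact ones."""
    sum_errors = 0
    cout_errors = 0
    for bits in product((0, 1), repeat=n_inputs):
        exact_sum, exact_cout = exact(*bits)
        approx_sum, approx_cout = approx(*bits)
        sum_errors += int(exact_sum != approx_sum)
        cout_errors += int(exact_cout != approx_cout)
    return sum_errors, cout_errors

if __name__ == "__main__":
    print("AC1:", error_counts(ec1, ac1, 3))
    print("AC2:", error_counts(ec1, ac2, 3))
    print("AC3:", error_counts(ec2, ac3, 2))
```

Each call returns a (Sum errors, Cout errors) pair that can be compared directly with the corresponding ED entries.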
Cout, Cout1, and Cout2 are the Carry outputs of EC1, AC1, and AC2, respectively. In Table 2, X and Y are the inputs, and Sum, Cout, Sum3, and Cout3 are the outputs. Sum and Sum3 are the Sum outputs of EC2 and AC3, and Cout and Cout3 are their Carry outputs.

The truth tables (TTs) of each AC are obtained by approximating the outputs at the smallest feasible number of min-terms. The number of min-terms and their positions for a given output approximation are chosen so that the resulting logic equations meet three goals: the error distance (ED) should be small, the simplified output equation should have as few sum-of-products (SOP) terms as possible (ideally a single term), and each SOP term should be short. The ED measures how far the approximate output deviates from the exact output. Table 3 lists the simplified output equations of the proposed compressors, and Fig. 1 shows their logic diagrams.

Fig.1. Logic diagrams of the proposed compressors.

#### DESIGN OF APPROXIMATE WALLACE MULTIPLIER

This section describes an N × M approximate Wallace multiplier (AWM) and shows how to construct one using the proposed approximate compressors (ACs). N is the multiplicand and M is the multiplier; in this work both are 8-bit integers. An exact Wallace multiplier consists of three blocks: partial product generation (PPG), partial product compression (PPC), and carry-propagate addition (CPA). The PPG block generates the 64 partial products (PPs).
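The three blocks named above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the reduction circuitry of the paper: the function names are hypothetical, only exact 3:2 compressors (full adders) are used, they are applied greedily column by column, and the final carry-propagate addition is modelled as a plain weighted sum.

```python
def partial_products(n, m, width=8):
    """PPG: columns[k] holds every partial-product bit of weight 2**k."""
    columns = [[] for _ in range(2 * width - 1)]
    for i in range(width):              # bit i of the multiplier m
        for j in range(width):          # bit j of the multiplicand n
            columns[i + j].append(((n >> j) & 1) & ((m >> i) & 1))
    return columns

def compress_stage(columns):
    """PPC, one stage: apply an exact 3:2 compressor to each group of three bits."""
    out = [[] for _ in range(len(columns) + 1)]
    for k, col in enumerate(columns):
        bits = list(col)
        while len(bits) >= 3:
            a, b, c = bits.pop(), bits.pop(), bits.pop()
            out[k].append(a ^ b ^ c)                          # Sum keeps weight 2**k
            out[k + 1].append((a & b) | (a & c) | (b & c))    # Carry moves to 2**(k+1)
        out[k].extend(bits)                                   # leftover bits pass through
    if not out[-1]:
        out.pop()                                             # drop an unused top column
    return out

def wallace_multiply(n, m, width=8):
    columns = partial_products(n, m, width)
    while max(len(col) for col in columns) > 2:               # reduce to at most two rows
        columns = compress_stage(columns)
    # CPA, modelled here as a weighted sum of the two remaining rows.
    return sum(sum(col) << k for k, col in enumerate(columns))

if __name__ == "__main__":
    heights = [len(col) for col in partial_products(255, 255)]
    print(heights, sum(heights))
    print(wallace_multiply(200, 123), 200 * 123)
```

An approximate variant would swap the full adder inside `compress_stage` for an approximate compressor in the low-order columns only, which is the idea behind the LSB/MSB split used for the proposed multipliers.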
The partial products (PPs) are arranged as an array of 8 rows by 15 columns. To complete the binary addition, the PPC block must reduce the eight-row PP array to two rows, which requires several compressor stages. The final result is obtained by adding the two remaining rows of PPs.

Fig.2. Reduction circuitry of the 8×8 Wallace multiplier.

Figure 2 marks where 3:2 and 2:2 compressors are used in the reduction. To evaluate the proposed approximate compressors (ACs), two approximate Wallace multipliers (AWMs) were designed. The first is referred to as Proposed AWM1 (PAWM1) because it is built with AC1 and AC3. The second is referred to as Proposed AWM2 (PAWM2) because it is built with AC2 and AC3. Both PAWM1 and PAWM2 use the reduction circuitry shown in Figure 2.

The reduction circuitry is divided into a most significant bit (MSB) part and a least significant bit (LSB) part. Stages 1 to 4 carry out the partial product (PP) reduction with 2:2 and 3:2 compressors, and Stage 5 completes the row addition. The MSB part uses exact compressors. In PAWM1, the LSB part uses AC1 and AC3; in PAWM2, it uses AC2 and AC3.

### 3. SIMULATION RESULTS AND DISCUSSION

This section evaluates the proposed multipliers on the basis of the experimental results and draws conclusions about their performance. For comparison, several further AWMs were also designed.
These comparison AWMs are referred to as designed AWMs (DAWMs). Each DAWM uses the same reduction circuitry as the proposed multipliers. The DAWMs are built from a variety of previously reported compressors (RCs). Each RC was implemented from the logic equations derived from its truth table. The logic equations of the RC outputs are given in Table 4.

Table.4. Logic equations of the reported compressors (RCs).

| RC | Sum | C<sub>out</sub> | Ref. |
|---|---|---|---|
| $RC_1$ | $A \oplus B \oplus C_{in}$ | $C_{in}$ | [12] |
| $RC_2$ | $A \oplus B$ | $A \cdot B$ | [12] |
| $RC_3$ | $\overline{A} \cdot \overline{B} + \overline{A} \cdot \overline{C}_{in} + \overline{B} \cdot \overline{C}_{in}$ | $A \cdot B + A \cdot C_{in} + B \cdot C_{in}$ | [12] |
| $RC_4$ | $\overline{C}_{in} \cdot \overline{A} \oplus \overline{B}$ | $B + A \cdot C_{in}$ | [15] |
| $RC_5$ | $\overline{A} \cdot \overline{B} + \overline{A} \cdot \overline{C}_{in}$ | $B + A \cdot C_{in}$ | [15] |
| $RC_6$ | $\overline{A} \cdot C_{in} + B \cdot C_{in}$ | $A$ | [15] |

Table.5. Logic equations of the exact compressors (ECs).

| EC | Sum | C<sub>out</sub> |
|---|---|---|
| $EC_1$ | $A \oplus B \oplus C_{in}$ | $A \cdot B + A \cdot C_{in} + B \cdot C_{in}$ |
| $EC_2$ | $A \oplus B$ | $A \cdot B$ |

In addition, the exact Wallace multiplier (EWM) was designed and analyzed.
The EWM uses the exact compressors EC1 and EC2 in both its LSB and MSB parts. The logic equations of the exact compressors are summarized in Table 5. The design metrics (DMs) of PAWM1 and PAWM2, namely power, delay, power-delay product (PDP), and area, are extracted and then compared with those of the EWM and the DAWMs.

#### **SYNTHESIS ENVIRONMENT**

All multipliers used to extract the DMs were described in Verilog at the register transfer level (RTL). Each multiplier was functionally verified in Cadence NCSim using a conventional Verilog test bench. The RTL code was then synthesized with the Cadence RTL Compiler under specified process-voltage-temperature (PVT) conditions. Figure 3 shows the synthesis setup used to extract the DMs. The inputs to the synthesis tool are the Verilog RTL code and the TSMC standard cell library for 180 nm technology; synthesis cannot run without them. The RTL Compiler translates these inputs into a gate-level netlist, from which the DMs are extracted. All multipliers were synthesized at a temperature of 27 °C and a supply voltage (Vdd) of 1.8 V so that the comparison is fair.

Fig.3. Synthesis environment for DM extraction.

After synthesis, the input drive strength and output load were set for DM extraction.
The synthesis results are analyzed below and compared with those of the previously reported Wallace multipliers. The extracted design metrics (DMs) are listed in Table 6.

Table.6. Design metrics of the exact, designed, and proposed Wallace multipliers.

| Multiplier | Leakage power (nW) | Dynamic power (nW) | Total power (nW) | Delay (ps) | PDP (fJ) | Area (μm²) |
|---|---|---|---|---|---|---|
| EWM | 31.978 | 505595.093 | 505627.071 | 6275 | 3172.80 | 4746.773 |
| DAWM1 | 26.814 | 482868.649 | 482895.463 | 5168 | 2495.60 | 4194.590 |
| DAWM2 | 30.460 | 459866.829 | 459897.289 | 5081 | 2336.73 | 4613.717 |
| DAWM3 | 22.727 | 368506.408 | 368529.135 | 5081 | 1872.49 | 3915.173 |
| DAWM4 | 28.364 | 389600.886 | 389629.249 | 5081 | 1979.70 | 4404.154 |
| DAWM5 | 20.948 | 359891.180 | 359912.129 | 5081 | 1828.71 | 3565.901 |
| DAWM6 | 21.186 | 338063.774 | 338084.960 | 5081 | 1717.81 | 3705.610 |
| PAWM1 | 18.857 | 300213.272 | 300232.129 | 5081 | 1525.48 | 2990.434 |
| PAWM2 | 21.087 | 294411.904 | 294432.991 | 5081 | 1496.01 | 3386.275 |

According to the power columns, the total power of the proposed multipliers PAWM1 and PAWM2 is 300232.129 nW and 294432.991 nW (about 300.23 μW and 294.43 μW), respectively. These are the lowest values among all the compared multipliers, which is attributed to the design of the proposed compressors. The delay column shows that both proposed multipliers have the same delay of 5081 ps, which is also the delay of DAWM2 through DAWM6. The PDP column shows that PAWM1 and PAWM2 have PDPs of 1525.48 fJ and 1496.01 fJ, respectively.
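The PDP figures follow directly from the total power and delay columns, since PDP is their product. The short sketch below reproduces that arithmetic for three rows copied from the results table; the dictionary layout and function names are illustrative, and the relative-saving helper is an added convenience rather than a metric reported in the paper.

```python
# (total power in nW, delay in ps), copied from the results table.
RESULTS = {
    "EWM": (505627.071, 6275),
    "PAWM1": (300232.129, 5081),
    "PAWM2": (294432.991, 5081),
}

def pdp_fj(power_nw, delay_ps):
    """Power-delay product in fJ: 1 nW * 1 ps = 1e-21 J = 1e-6 fJ."""
    return power_nw * delay_ps * 1e-6

def saving_percent(value, reference):
    """Relative reduction of value with respect to reference, in percent."""
    return 100.0 * (reference - value) / reference

if __name__ == "__main__":
    reference_pdp = pdp_fj(*RESULTS["EWM"])
    for name, (power_nw, delay_ps) in RESULTS.items():
        pdp = pdp_fj(power_nw, delay_ps)
        print(f"{name}: PDP = {pdp:.2f} fJ, "
              f"saving vs EWM = {saving_percent(pdp, reference_pdp):.1f} %")
```

The same helper can be applied to the area or power columns to express the other metrics relative to the exact multiplier.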
The PDP values of PAWM1 and PAWM2 are the lowest among all the compared multipliers. The proposed multipliers reach the lowest power-delay product (PDP) because they combine the lowest total power with the lowest delay in the comparison. The area column shows that PAWM1 and PAWM2 also occupy less area than the other multipliers: 2990.434 μm² and 3386.275 μm², respectively. Overall, the data show that the proposed multipliers outperform all the other multipliers in terms of the design metrics (DMs). The proposed multipliers are therefore well suited to image processing applications that are constrained in both power and area.

### **APPLICATION OF PROPOSED MULTIPLIER**

The proposed multipliers are demonstrated on an image processing task. Image multiplication is a widely used operation in image processing, for example to enhance and scale images. The objective here is to evaluate how well the proposed multipliers perform on image data. Two distinct test images are multiplied to produce an output image.

Fig.4. Multiplication of the test images using the exact 8×8 Wallace multiplier.

The test images used are image 1 and image 2, shown in Figures 4(a) and 4(b). These images were first multiplied exactly. The test images are stored in Portable Network Graphics (PNG) format with a size of 512 × 512. These test images were selected for the image processing evaluation. Each test image contains 512 × 512 = 262,144 pixels. Each pixel is an 8-bit unsigned integer with a value between 0 and 255.
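The pixel-wise multiplication of two 8-bit images, and the PSNR used to score an approximate result against the exact one, can be sketched as follows. Several things here are assumptions rather than details from the paper: images are plain nested lists, `peak` is left as a parameter because the peak value used in the original evaluation is not stated, and `truncating_multiply` is a hypothetical stand-in for an approximate multiplier, not PAWM1 or PAWM2.

```python
import math

def multiply_images(img_a, img_b, multiply):
    """Multiply two equally sized images pixel by pixel with the given multiplier."""
    return [[multiply(a, b) for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(img_a, img_b)]

def psnr(reference, test, peak):
    """PSNR in dB: 10 * log10(peak**2 / MSE); infinite when the images are identical."""
    squared_error = 0
    count = 0
    for ref_row, test_row in zip(reference, test):
        for r, t in zip(ref_row, test_row):
            squared_error += (r - t) ** 2
            count += 1
    mse = squared_error / count
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / mse)

def exact_multiply(a, b):
    return a * b

def truncating_multiply(a, b, dropped_bits=4):
    """Hypothetical approximate multiplier: zero the lowest bits of the product."""
    return ((a * b) >> dropped_bits) << dropped_bits

if __name__ == "__main__":
    # Tiny synthetic 2x3 "images" with 8-bit pixel values, for illustration only.
    image_1 = [[12, 200, 255], [7, 99, 128]]
    image_2 = [[33, 180, 255], [250, 3, 64]]
    exact = multiply_images(image_1, image_2, exact_multiply)
    approx = multiply_images(image_1, image_2, truncating_multiply)
    # Assumption: the 16-bit product range sets the peak value.
    print(psnr(exact, approx, peak=65535))
```

With real data, the two nested lists would be replaced by the decoded pixel arrays of the test images, and the stand-in multiplier by a bit-accurate model of the multiplier under test.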
The two test images are multiplied pixel by pixel. Figure 4(c) shows the result of multiplying the test images with the exact 8×8 multiplier. The same image multiplication is then repeated with each approximate multiplier, and the peak signal-to-noise ratio (PSNR) of each design is compared with that of the others. The resulting images are shown in Figure 5.

Fig.5. Multiplication of the test images using the 8×8 approximate Wallace multipliers.

The PSNR is calculated for each multiplier in MATLAB to assess image fidelity. The PSNR values for each design are listed in Table 7. As shown in Table 7, the PSNRs of PAWM1 and PAWM2 are 51.000 dB and 51.355 dB, respectively. The corresponding values for the other multipliers are also listed. The proposed multipliers achieve higher PSNRs than DAWM2, DAWM4, and DAWM6, but lower PSNRs than DAWM1, DAWM3, and DAWM5. Overall, the PSNRs of PAWM1 and PAWM2 are comparable to those of the existing designs.

Table.7. PSNR comparison of the designed and proposed Wallace multipliers.

| AWM | PSNR (dB) |
|-------|-----------|
| DAWM1 | 52.996 |
| DAWM2 | 50.688 |
| DAWM3 | 52.106 |
| DAWM4 | 48.744 |
| DAWM5 | 55.511 |
| DAWM6 | 47.386 |
| PAWM1 | 51.000 |
| PAWM2 | 51.355 |

### 4. CONCLUSION

The proposed approximate Wallace multipliers PAWM1 and PAWM2 were designed and evaluated in this work. They were also compared with previously reported multiplier designs.
The results show that the proposed multipliers achieve better design metrics (DMs) than their counterparts. When applied to image multiplication, the proposed multipliers produce high-quality images whose quality is close to that of exact multiplication. The proposed multipliers are therefore a sound choice for image processing applications that require a high peak signal-to-noise ratio (PSNR), low power consumption, and small area.

#### REFERENCES

- 1. R.R. Osorio and G. Rodriguez, Truncated SIMD Multiplier Architecture for Approximate Computing in Low-Power Programmable Processors, IEEE Access, Vol. 7, pp. 56353-56366, 2019.
- 2. H. Jiang, C. Liu, F. Lombardi and J. Han, Low-Power Approximate Unsigned Multipliers with Configurable Error Recovery, IEEE Transactions on Circuits and Systems-I: Regular Papers, Vol. 66, No. 1, pp. 189-202, 2019.
- 3. L.B. Soares, M.M. Azevedo Da Rosa, C.M. Diniz, E.A.C. Costa and S. Bampi, Design Methodology to Explore Hybrid, Approximate Adders for Energy-Efficient Image and Video Processing Accelerators, IEEE Transactions on Circuits and Systems-I: Regular Papers, Vol. 66, No. 6, pp. 2137-2150, 2019.
- 4. I. Alouani, H. Ahangari, O. Ozturk and S. Nair, A Novel Heterogeneous Approximate Multiplier for Low Power and High Performance, IEEE Embedded Systems Letters, Vol. 10, No. 2, pp. 45-48, 2018.
- 5. S. Ataei and J.E. Stine, A 64 kB Approximate SRAM Architecture for Low-Power Video Applications, IEEE Embedded Systems Letters, Vol. 10, No. 1, pp. 10-13, 2018.
- 6. Minho Ha and Sunggu Lee, Multipliers with Approximate 4:2 Compressors and Error Recovery Modules, IEEE Embedded Systems Letters, Vol. 10, No. 1, pp. 6-9, 2018.
- 7. M. Ostal, A. Ibrahim, H. Chible and M. Valle, Inexact Arithmetic Circuits for Energy Efficient IoT Sensors Data Processing, Proceedings of IEEE International Symposium on Circuits and Systems, pp. 1-4, 2018.
- 8. W. Liu, J. Xu, D. Wang, C. Wang, P. Montuschi and F. Lombardi, Design and Evaluation of Approximate Logarithmic Multipliers for Low Power Error-Tolerant Application, IEEE Transactions on Circuits and Systems-I: Regular Papers, Vol. 65, No. 9, pp. 2856-2868, 2018.
- 9. C.V. Gowdar, M.C. Parameshwara and S. Sonoli, Comparative Analysis of Various Approximate Full Adders under RTL Codes, ICTACT Journal on Microelectronics, Vol. 6, No. 2, pp. 947-952, 2020.
- 10. C.V. Gowdar, M.C. Parameshwara and S. Sonoli, Approximate Full Adders for Multimedia Processing Applications, Proceedings of IEEE International Conference for Innovation in Technology, pp. 1-4, 2020.