This thesis presents an algorithm and an architecture for the simultaneous diagonalization of matrices. As an example, a subspace-based speech enhancement problem is considered, in which the covariance matrices of the speech and the noise are diagonalized simultaneously. To evaluate the performance of the proposed algorithm, objective measurements of speech enhancement are reported in terms of signal-to-noise ratio and mean Bark spectral distortion at various noise levels. In addition, an innovative subband analysis technique for subspace-based, time-domain-constrained speech enhancement is proposed. The proposed technique analyses the signal in its subbands to build accurate estimates of the covariance matrices of speech and noise, exploiting the inherently slowly varying characteristics of speech and noise signals in narrow bands. The subband approach also decreases the computation time by reducing the order of the matrices to be simultaneously diagonalized. Simulation results indicate that the proposed technique performs well under extremely low signal-to-noise-ratio conditions. Further, an architecture is proposed to implement the simultaneous diagonalization scheme. The architecture is implemented on an FPGA, primarily to assess its performance in hardware and the feasibility of the speech enhancement algorithm in terms of resource utilization, throughput, and related measures. A Xilinx FPGA is targeted for implementation. The FPGA resource utilization reinforces the practicability of the design. A projection of the design's feasibility for an ASIC implementation, in terms of transistor count only, is also included.
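To illustrate what simultaneous diagonalization of a speech and a noise covariance matrix means, the following is a minimal NumPy sketch. It is not the algorithm or architecture proposed in the thesis; it uses the standard textbook route (Cholesky whitening of the noise covariance followed by a symmetric eigendecomposition), and the function name `simultaneous_diagonalize`, the variable names, and the randomly generated test matrices are assumptions made only for this illustration.

```python
import numpy as np


def simultaneous_diagonalize(r_speech, r_noise):
    """Return (eigvals, V) such that V.T @ r_noise @ V = I and
    V.T @ r_speech @ V = diag(eigvals).

    Assumes r_speech is symmetric and r_noise is symmetric positive definite.
    Illustrative textbook method, not the scheme proposed in the thesis.
    """
    # Factor the noise covariance: r_noise = L @ L.T
    L = np.linalg.cholesky(r_noise)
    L_inv = np.linalg.inv(L)

    # Whiten the speech covariance with respect to the noise
    whitened = L_inv @ r_speech @ L_inv.T
    whitened = 0.5 * (whitened + whitened.T)  # remove rounding asymmetry

    # Orthogonal eigendecomposition of the whitened matrix
    eigvals, U = np.linalg.eigh(whitened)

    # Map the eigenvectors back to the original coordinates
    V = L_inv.T @ U
    return eigvals, V


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 4
    # Hypothetical covariance matrices built from random data
    a = rng.standard_normal((n, n))
    b = rng.standard_normal((n, n))
    r_speech = a @ a.T
    r_noise = b @ b.T + n * np.eye(n)

    eigvals, V = simultaneous_diagonalize(r_speech, r_noise)
    print(np.allclose(V.T @ r_noise @ V, np.eye(n)))
    print(np.allclose(V.T @ r_speech @ V, np.diag(eigvals)))
```

The cost of the factorization and the eigendecomposition grows roughly with the cube of the matrix order, which is consistent with the observation above that reducing the order of the matrices in each subband decreases computation time.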