Traditional fast Discrete Cosine Transform (DCT) and Inverse DCT (IDCT) algorithms focus on reducing arithmetic complexity and have fixed run-time complexity regardless of the input. Recently, data-dependent signal processing has been applied to the DCT/IDCT, yielding algorithms with variable run-time complexity. A new two-dimensional 8 x 8 low-power DCT/IDCT design is implemented in VHDL by applying the data-dependent signal-processing concept to a traditional fixed-complexity fast DCT/IDCT algorithm. To reduce power, the design is based on Loeffler's fast algorithm, which requires few multiplications. In addition, zero bypassing, data segmentation, input truncation, and hardwired canonical signed-digit (CSD) multipliers are used to reduce run-time computation, and hence switching activity and power. When synthesized using Canadian Microelectronics Corporation 3-V 0.35 µm CMOSP technology, the FDCT/IDCT design consumes 122.7/124.9 mW at a clock frequency of 40 MHz and a processing rate of 320 Msamples/s. With technology scaling to 0.35 µm, the proposed design has lower switching capacitance per sample, i.e., it is more power-efficient, than previously reported high-performance FDCT/IDCT designs.*

*This work is supported by a Natural Sciences and Engineering Research Council of Canada (NSERC) post-graduate scholarship and NSERC research grants.
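As an illustration of two of the techniques named above, the following Python sketch models a CSD constant multiplier with zero bypassing in software. It is a behavioral sketch under stated assumptions, not the VHDL design described here: the function names `to_csd` and `csd_multiply`, the 8-bit fixed-point scaling of the example constant, and the operation counter (a rough stand-in for adder activity) are all hypothetical choices made for this example.

```python
import math


def to_csd(n):
    """Return the canonical signed-digit form of a non-negative integer.

    Digits are in {-1, 0, +1}, least significant first, and no two
    adjacent digits are nonzero.
    """
    digits = []
    while n != 0:
        if n & 1:
            # +1 when n mod 4 == 1, -1 when n mod 4 == 3
            d = 2 - (n & 3)
            n -= d
        else:
            d = 0
        digits.append(d)
        n >>= 1
    return digits


def csd_multiply(x, digits):
    """Multiply x by the constant encoded in `digits` using shifts and adds.

    Returns (product, ops), where ops counts the add/subtract steps
    performed. A zero input is bypassed and performs no arithmetic.
    """
    if x == 0:
        return 0, 0  # zero bypassing: skip the shift-add network entirely
    acc = 0
    ops = 0
    for shift, d in enumerate(digits):
        if d == 1:
            acc += x << shift
            ops += 1
        elif d == -1:
            acc -= x << shift
            ops += 1
    return acc, ops


if __name__ == "__main__":
    # Assumed example constant: cos(pi/4) scaled by 2**8 (hypothetical precision).
    const = round(math.cos(math.pi / 4) * 256)
    digits = to_csd(const)
    for sample in (0, 3, -17):
        product, ops = csd_multiply(sample, digits)
        print(sample, product, ops)
```

In hardware the constant is fixed, so the shift-add network would be wired once per coefficient; the software loop only mimics that structure. The bypass branch shows why zero-valued inputs, which are common among DCT coefficients after quantization, can avoid switching activity altogether.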