Accurate load forecasting is essential for minimizing the cost of generating and supplying power. It has many applications in energy production, distribution, and infrastructure construction. Because load data exhibit high autocorrelation and strong seasonality, it is difficult to build robust and generalizable forecasting models. To address this problem, we propose a hybrid model, the Fourier Split NET (FSNET). The proposed model consists of two phases. The first is a deseasonalization phase, in which the model uses the fast Fourier transform to isolate the seasonal component of the data. The second consists of training a simple linear model to replicate the seasonal behavior of the data and training a group of LSTM neural networks on different clusters of the data. The model uses statistical features to build separate LSTM models for different groups of data. In experiments on open datasets, FSNET obtained higher accuracy than other forecasting approaches across several accuracy metrics. In a second contribution, we propose a novel approach to load forecasting that uses the task affinity score to measure the distance between tasks. The task affinity score provides a more effective way to measure the similarity between tasks in a transfer learning context. We demonstrate its efficacy through empirical analysis on a synthetic dataset. Our results show that the task affinity score outperforms other intuitive metrics, such as the loss function, for task selection. To apply this approach, we present the Affinity-Driven Transfer Learning (ADTL) algorithm for load forecasting. The ADTL algorithm optimizes the transfer learning process by leveraging knowledge from pre-trained models and datasets to improve the accuracy of load forecasting on new, previously unseen datasets.
We validate the effectiveness of the ADTL algorithm by testing it on two real-world datasets: the Australian Energy Market Operator (AEMO) dataset and the Smart Australian dataset. Overall, our study highlights the importance of the task affinity score in transfer learning for load forecasting applications. The proposed ADTL algorithm provides a practical solution for improving the efficiency and convergence speed of load forecasting in the energy industry.
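As an illustration of the deseasonalization phase described above, the following minimal sketch separates a series into a seasonal component and a residual by keeping only the strongest frequency bins. It is not the FSNET implementation: the function names (`dft`, `idft`, `split_seasonal`) and the `n_peaks` parameter are hypothetical, the rule for choosing which frequencies count as seasonal is an assumption, and a naive discrete Fourier transform stands in for the fast Fourier transform so that the example needs only the standard library.

```python
import cmath
import math


def dft(x):
    """Naive O(n^2) discrete Fourier transform (stand-in for an FFT)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Inverse transform, returning only the real part of each sample."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)).real / n
        for t in range(n)
    ]


def split_seasonal(series, n_peaks=1):
    """Split a series into (seasonal, residual).

    Assumption: the seasonal component is carried by the `n_peaks`
    largest-magnitude non-zero frequency bins of the mean-centered series.
    """
    n = len(series)
    mean = sum(series) / n
    spectrum = dft([v - mean for v in series])

    # Rank the positive-frequency bins by magnitude and keep the strongest.
    top = sorted(range(1, n // 2 + 1), key=lambda k: abs(spectrum[k]), reverse=True)[:n_peaks]
    # Keep the mirrored negative-frequency bins too, so the inverse is real.
    keep = set(top) | {(n - k) % n for k in top}

    filtered = [spectrum[k] if k in keep else 0j for k in range(n)]
    seasonal = idft(filtered)
    residual = [v - s for v, s in zip(series, seasonal)]
    return seasonal, residual


# Hypothetical hourly load with a 24-step cycle around a constant base level.
load = [10.0 + 3.0 * math.sin(2 * math.pi * t / 24) for t in range(48)]
seasonal, residual = split_seasonal(load, n_peaks=1)
```

In this sketch the seasonal part would be the target of the simple linear model, while the residual would be passed on to the LSTM models; how FSNET actually selects the seasonal frequencies is not specified here.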