Because electricity plays a crucial role in a country's industrial infrastructure, power companies monitor and control that infrastructure to improve energy management and scheduling and to develop efficiency plans. Smart grids are an example of critical infrastructure that can bring substantial advantages, such as higher resilience and lower maintenance costs. Accurate load forecasting is critical for a stable and efficient energy supply in which load and supply are matched. However, the nonlinear nature of electric load data introduces a high level of uncertainty into predictions of future load, which makes forecasting challenging. Many studies have examined algorithms for electricity load forecasting; among the most popular are deep neural networks, regression-based methods, ARIMA, and seasonal ARIMA (SARIMA). This thesis discusses various algorithms and analyzes their performance for short-term load forecasting. In addition, a new hybrid deep learning model that combines long short-term memory (LSTM) and a convolutional neural network (CNN) is proposed to carry out load forecasting without using any exogenous variables. The proposed model differs from previous hybrid CNN-LSTM models in that those models usually use the CNN to extract features, whereas the proposed model focuses on the connection between the LSTM and the CNN. This methodology helps to increase the model's accuracy because trend analysis and feature extraction are performed separately, so neither process affects the other. Two real-world data sets, "hourly load consumption of Malaysia" and "daily power electric consumption of Germany", are used to test and compare the presented models. The performance of the tested models is evaluated using root mean squared error (RMSE), mean absolute percentage error (MAPE), and R-squared.
The results show that deep neural network models are good candidates for short-term prediction tools. On the German data, the proposed model improved accuracy from 83.17\% for the LSTM to 91.18\%. On the Malaysian data, the proposed model's accuracy is 98.23\%, an excellent result for load forecasting. The thesis is divided into two parts: the first identifies the best technique for short-term load forecasting, and the second examines the performance of that technique in more detail. Because the proposed model performed best in the first part, it is then challenged to predict the load for the next day, the next two days, and the next 10 days of the Malaysian data set, as well as the next 7, 10, and 30 days of the German data set. The proposed model also performs well on these tasks: its accuracy is 94.16\% for 10 days ahead on the Malaysian data and 82.19\% for 30 days ahead on the German data. Because both the German and Malaysian data sets are highly aggregated, a data set from a research building in France is used to challenge the proposed model further. The average accuracy in the French experiment is almost 77\%, which is reasonable for such complex data without any auxiliary variables. In addition, because the Malaysian and French data include hourly weather data, the model is re-evaluated with weather data added and compared with its performance without it. The results show that weather data can have a positive influence on the model. Together, these results demonstrate the strength of the proposed model and its stability on challenging tasks such as forecasting over different time horizons, using two different data sets, and working with complex data.
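For readers unfamiliar with the three evaluation metrics named above, the following is a minimal sketch of how RMSE, MAPE, and R-squared are commonly computed for a load forecast. It is an illustration, not the code used in this thesis: the function names and the sample load values are hypothetical, and it uses the standard textbook definitions. It also does not assume any particular relationship between these metrics and the accuracy percentages reported above.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error, in the same units as the load."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(actual))


def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Assumes no actual value is zero, which normally holds for
    aggregated load data.
    """
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(actual)


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Hypothetical hourly loads (MW) and forecasts, for illustration only.
    observed = [100.0, 200.0, 300.0, 400.0]
    forecast = [110.0, 190.0, 310.0, 390.0]
    print("RMSE:", rmse(observed, forecast))
    print("MAPE (%):", mape(observed, forecast))
    print("R-squared:", r_squared(observed, forecast))
```

RMSE penalizes large errors more heavily and is scale dependent, whereas MAPE is scale free, which makes it more convenient when comparing results across data sets with different load magnitudes, such as the national and single-building data used here.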