District-level cooling energy profiles are important for the design, sizing, and operation of district cooling plants. Developing them is often challenging because of the lack of building stock data, uncertainty in model inputs, the choice of modeling strategy, and the temporal and spatial resolutions required of the cooling demand profiles. The work presented in this thesis attempts to address these challenges by leveraging a data-driven methodology to produce monthly cooling demand profiles at the district level for a hot and arid climate region. The methodology comprises two main phases. The first conducts a comparative analysis to identify the building archetype (BA) development methodology best suited to the studied climate zone. The second uses the results of that analysis to develop an archetype library for the district’s building stock. This library is then used to generate synthetic data for training and testing machine learning (ML) models for district-scale energy prediction. The results indicate that, for the building archetypes studied, ML models can accurately predict the profiles generated by physics-based models while reducing the computational time involved. The results also indicate that the uncertainty in the predicted cooling demand profiles remains considerable in the absence of realistic input distributions. Ultimately, this study contributes to the current body of work by introducing a BA library for high-rise buildings in hot and arid climate areas and by proposing a complete pipeline for data generation, ML model training and testing, and data aggregation.