Taxi services have become one of the most important means of transportation worldwide. Optimizing a taxi service can significantly reduce transportation costs, idle driving time, and passenger waiting time while improving service quality. However, the specific characteristics of taxi service make this optimization a cumbersome task. In this research, we study the taxi dispatching problem and propose an approach that combines mathematical programming with machine learning to optimize the network. We present a data-driven optimization methodology in which machine learning techniques use historical time-series data to forecast future demand, and a mathematical program then uses those forecasts to dispatch vehicles. Specifically, Support Vector Regression and K-Nearest Neighbor are adopted to learn passenger demand patterns from time-series data. A mixed-integer programming (MIP) model is then built to minimize total idle driving distance while balancing the supply-demand ratio across regions. The aim is to match supply to demand in the different regions (nodes) of a city in order to increase service efficiency and minimize the total idle driving distance. The proposed method uses historical GPS data to build demand models and applies prediction techniques to determine optimal locations for vacant taxis given anticipated future demand. From a system-level perspective, we compute optimal dispatch solutions that reach a globally balanced supply-demand ratio with the least associated cruising distance under practical constraints. We apply our approach to a real-world case study from New York City to demonstrate its efficiency and effectiveness.
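As a rough illustration of the K-Nearest Neighbor forecasting idea mentioned above, the following minimal sketch predicts the next demand value for one region by averaging what followed the k most similar past windows of the series. It is not the implementation used in this study: the function name `knn_forecast`, the `window` and `k` parameters, and the pickup counts are all hypothetical and chosen only for illustration.

```python
from math import dist


def knn_forecast(series, window=3, k=2):
    """Predict the next value of `series` from its k most similar past windows.

    Hypothetical helper for illustration; not taken from the study.
    """
    if len(series) <= window:
        raise ValueError("series must be longer than the window")
    # Each training pair is (lagged window, value that followed it).
    pairs = [
        (series[i:i + window], series[i + window])
        for i in range(len(series) - window)
    ]
    query = series[-window:]
    # Rank past windows by Euclidean distance to the most recent window.
    nearest = sorted(pairs, key=lambda p: dist(p[0], query))[:k]
    # The forecast is the mean of the values that followed the nearest windows.
    return sum(target for _, target in nearest) / len(nearest)


# Made-up hourly pickup counts for a single region (illustrative data only).
pickups = [10, 20, 30, 10, 20, 30, 10, 20, 30, 10, 20]
forecast = knn_forecast(pickups, window=2, k=2)
```

In a pipeline like the one described, one such forecast per region would presumably serve as the demand input to the MIP dispatch model; how the forecasts are actually passed to the optimizer is not specified here.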