Traffic forecasting is a vital part of intelligent transportation systems. It becomes particularly challenging due to short-term (e.g., accidents, constructions) and long-term (e.g., peak-hour, seasonal, weather) traffic patterns. While most of the previously proposed techniques focus on normal condition forecasting, a single framework for extreme condition traffic forecasting does not exist. To address this need, we propose to take a deep learning approach. We build a deep neural network based on long short term memory (LSTM) units. We apply Deep LSTM to forecast peak-hour traffic and manage to identify unique characteristics of the traffic data. We further improve the model for post-accident forecasting with Mixture Deep LSTM model. It jointly models the normal condition traffic and the pattern of accidents. We evaluate our model on a real-world large-scale traffic dataset in Los Angeles. When trained end-to-end with suitable regularization, our approach achieves 30%–50% improvement over baselines. We also demonstrate a novel technique to interpret the model with signal stimulation. We note interesting observations from the trained neural network.