E-commerce websites now commonly encourage their users to rate items and write text reviews. Review text has been shown to be very useful for understanding user preferences and item properties, and thus improves the ability of these websites to make personalized recommendations. In this paper, we propose to model user preferences and item properties using convolutional neural networks (CNNs) with attention, motivated by the success of CNNs in many natural language processing tasks. From the aggregated review text of each user and each item, attention-based CNNs build vector representations of the user and the item. These representations are then used to predict the rating a user would give an item. We train the user and item networks jointly, which enables interaction between users and items in a way similar to matrix factorization. In addition, visualizing the attention layer gives insight into which words the models select as highlighting a user’s preferences or an item’s properties. We validate the proposed models on two popular review datasets, Yelp and Amazon, and compare the results with matrix factorization (MF) and hidden factor and topical (HFT) models. Our experiments show improvement over HFT, which demonstrates the effectiveness of the representations our networks learn from review text for rating prediction.
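As a rough illustration of the pipeline described above, the following minimal sketch encodes a user's and an item's review text with a toy attention-plus-CNN encoder and combines the two vectors with a dot product, in the spirit of matrix factorization. Everything in it is an assumption made for illustration: the class `AttentionCNN`, the function `predict_rating`, the layer sizes, the single context vector used for attention, the bias of 3.0, and the example sentences are hypothetical and are not taken from the proposed models. Weights are random and the joint training step is omitted.

```python
import math
import random

random.seed(0)
EMB_DIM, N_FILTERS, WIDTH = 8, 4, 3  # illustrative sizes, not from the paper


def rand_vec(n):
    return [random.uniform(-0.5, 0.5) for _ in range(n)]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class AttentionCNN:
    """Toy encoder: word attention -> 1-D convolution -> max-over-time pooling."""

    def __init__(self, vocab):
        self.emb = {w: rand_vec(EMB_DIM) for w in vocab}
        self.context = rand_vec(EMB_DIM)  # hypothetical attention query vector
        self.filters = [rand_vec(EMB_DIM * WIDTH) for _ in range(N_FILTERS)]

    def attention(self, words):
        # One weight per word; the weights sum to 1.
        return softmax([dot(self.emb[w], self.context) for w in words])

    def encode(self, words):
        # Assumes len(words) >= WIDTH so that at least one window exists.
        weights = self.attention(words)
        # Scale each word embedding by its attention weight.
        scaled = [[a * x for x in self.emb[w]] for a, w in zip(weights, words)]
        # Concatenate each run of WIDTH consecutive word vectors into one window.
        windows = [sum(scaled[i:i + WIDTH], [])
                   for i in range(len(scaled) - WIDTH + 1)]
        # ReLU, then max-over-time pooling gives one value per filter.
        return [max(max(0.0, dot(f, win)) for win in windows)
                for f in self.filters]


def predict_rating(user_net, item_net, user_words, item_words, bias=3.0):
    # Dot product of the user and item vectors, analogous to matrix factorization.
    return bias + dot(user_net.encode(user_words), item_net.encode(item_words))


user_text = "great pizza friendly staff love the crust".split()
item_text = "pizza crust was crisp and the staff was friendly".split()
vocab = sorted(set(user_text) | set(item_text))
user_net, item_net = AttentionCNN(vocab), AttentionCNN(vocab)

rating = predict_rating(user_net, item_net, user_text, item_text)
weights = user_net.attention(user_text)  # per-word weights one could visualize
```

In a trained system, both networks would be optimized together on a rating-prediction loss such as squared error, and inspecting `weights` would correspond to the attention visualization mentioned above.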