This dataset includes lake total phosphorus (TP), true water color, and chlorophyll a (CHLa) concentrations from summer epilimnetic water samples and is a subset of the larger LAGOS database (Lake multi-scaled geospatial and temporal database, described in Soranno et al. 2015). LAGOS compiles multiple individual lake water chemistry datasets into an integrated database. We accessed LAGOSLIMNO version 1.040.0 for lake water chemistry data and LAGOSGEO version 1.02 for lake catchment geographic data. In the LAGOSLIMNO database, lake water chemistry data were collected from individual state agency sampling and volunteer programs designed to monitor lake water quality. Water chemistry analyses follow standard lab methods. In the LAGOSGEO database, geographic data were collected from national-scale geographic information systems (GIS) data layers. Lake catchments, defined as 'The area of land that drains directly into a lake, and into all upstream-connected, permanent streams to that lake exclusive of any upstream lake watersheds for lakes greater than or equal to 10 ha that are connected via permanent streams', were delineated for lakes greater than or equal to 4 ha. Lake-stream connectivity type was assigned to lakes greater than or equal to 4 ha using GIS tools based on the National Hydrography Dataset (see Soranno et al. 2015 for LAGOS geographic processing steps). A subset of lake and geographic data was created to examine spatial variation in TP and water color relationships with CHLa across broad geographic extents using spatially varying coefficient models in a Bayesian framework. Lakes were selected that had complete records for summer epilimnetic TP, true water color, and CHLa. In addition, we selected lakes with surface area greater than or equal to 4 ha and less than 10,000 ha to exclude very small and very large lakes from the analyses. The resulting dataset includes 838 lakes in Wisconsin, Michigan, New York, and Maine with 7395 observations.
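The selection criteria above can be sketched in a few lines of Python. This is only an illustration, not the code used to build the dataset: the field names `tp`, `color`, `chla`, and `area_ha` are hypothetical and may not match the column names in the data files.

```python
def select_lakes(records):
    """Keep records with complete TP, color, and CHLa values
    and a surface area of at least 4 ha and less than 10,000 ha.

    Field names (tp, color, chla, area_ha) are hypothetical.
    """
    required = ("tp", "color", "chla")
    selected = []
    for rec in records:
        # Drop records missing any of the three water chemistry values.
        if any(rec.get(key) is None for key in required):
            continue
        # Apply the surface area window: 4 ha <= area < 10,000 ha.
        if 4 <= rec["area_ha"] < 10000:
            selected.append(rec)
    return selected
```

Note that the lower bound is inclusive and the upper bound is exclusive, matching the wording of the criteria.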
The majority of lakes in the data subset (~72%) have only one water chemistry observation. There are 228 lakes with more than one water chemistry observation taken on different sampling occasions over time (average of 29 observations per lake with repeated measures). The dataset reports the original, individual measurements. The proportions of agriculture and wetlands in the lake catchment were derived from land cover and land use data in the National Land Cover Database (2006). For the analyses, we withheld ten percent of the observations for model validation and to assess prediction accuracy. The remaining observations were used in the model-building steps. The 'dataset' column in the data indicates whether the observation belongs to the model-building ('mb') or hold-out ('h') dataset.
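As a minimal sketch of how the 'dataset' column might be used, the Python below separates model-building from hold-out observations and counts observations per lake. The 'dataset' column and its 'mb' and 'h' codes come from the description above; the `lake_id` field name is an assumption and may differ in the data files.

```python
from collections import Counter


def split_by_dataset(rows):
    """Split observations into model-building ('mb') and hold-out ('h') sets."""
    model_building = [row for row in rows if row["dataset"] == "mb"]
    hold_out = [row for row in rows if row["dataset"] == "h"]
    return model_building, hold_out


def observations_per_lake(rows, id_field="lake_id"):
    """Count observations per lake; id_field is a hypothetical column name."""
    return Counter(row[id_field] for row in rows)
```

Lakes whose count is greater than one in the result of `observations_per_lake` are those with repeated measures.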