This data product provides estimates of the composition of tree taxa at the time of European and Euro-American settlement of the northeastern United States. Composition is defined as the proportion of stems larger than approximately 20 cm diameter at breast height in each of 22 taxonomic groupings, generally at the genus level. The raw data come from settlement survey records, which are transcribed and then aggregated spatially to yield count data. The domain is divided into two regions, eastern (Maine to Ohio) and western (Indiana to Minnesota). In the western region, Public Land Survey point data are aggregated to a regular 8 km grid. In the eastern region, data from Town Proprietor Surveys are aggregated at the township level, that is, in irregularly shaped local administrative units. The product is based on a Bayesian statistical model, fit to the count data, that estimates composition on a regular 8 km grid. The statistical model allows us to estimate composition at locations with no data and to smooth over noise caused by limited counts at locations with data. Critically, it also allows us to quantify uncertainty in our composition estimates. We expect this data product to be useful for understanding the state of vegetation in the northeastern United States prior to large-scale European settlement. Beyond specific regional questions, the data product can also serve as a baseline against which to investigate how forests and ecosystems changed after intensive settlement. This material is based upon work supported by the National Science Foundation under grants #DEB-1241874, 1241868, 1241870, 1241851, 1241891, 1241846, 1241856, 1241930.
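
To make the definition of composition concrete, the following minimal Python sketch turns stem counts for a single spatial unit into proportions. It is illustrative only: the taxon names, the counts, and the function names `raw_composition` and `smoothed_composition` are hypothetical, and the additive (Dirichlet-style) smoothing is a simple stand-in chosen to show why limited counts give noisy proportions. It is not the Bayesian spatial model used to produce the data product.

```python
from typing import Dict


def raw_composition(counts: Dict[str, int]) -> Dict[str, float]:
    """Proportion of stems in each taxon for one spatial unit."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no stems counted in this unit; raw composition is undefined")
    return {taxon: n / total for taxon, n in counts.items()}


def smoothed_composition(counts: Dict[str, int], prior_weight: float = 1.0) -> Dict[str, float]:
    """Additive smoothing: add prior_weight pseudo-stems to every taxon.

    Assumption for illustration only. This is the posterior mean under a
    symmetric Dirichlet prior, not the model behind the data product.
    """
    total = sum(counts.values()) + prior_weight * len(counts)
    return {taxon: (n + prior_weight) / total for taxon, n in counts.items()}


# Hypothetical counts for one grid cell or township (not real survey data).
cell_counts = {"oak": 6, "maple": 3, "pine": 1, "beech": 0}

raw = raw_composition(cell_counts)
smooth = smoothed_composition(cell_counts)

for taxon in cell_counts:
    print(f"{taxon:6s} raw={raw[taxon]:.3f} smoothed={smooth[taxon]:.3f}")
```

With only ten stems, the raw proportion for a taxon that happens to be unobserved is exactly zero, while the smoothed value stays small but positive. This is the kind of noise from limited counts that a statistical model can temper, although the actual product also borrows information across space and reports uncertainty, which this sketch does not attempt.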