Running the Population Synthesizer

Alex Bettinardi edited this page Aug 1, 2023 · 28 revisions

Introduction to Population Synthesis

The population synthesizer software PopulationSim is used to generate a synthetic population for ABM. Users need to review the PopulationSim wiki before attempting to modify or use the tool.

PopulationSim for the 2010 ABM has been set up to use the following constraints (controls) to generate the synthetic residential population:

  • Meta:
    • Number of Persons by Age Group (0-5, 6-12, 13-15, 16-17, 18-24, 25-34, 35-44, 45-54, 55-64, 65-74, 75-84, 85+)
    • Number of Persons by Occupation Category (1, 2, 3, 4, 5, 6, as defined in the table Occupation Categories in Synthetic Population)
  • TAZ:
    • Households by Income in $2010 ($0-$15k, $15k-$25k, $25k-$35k, $35k-$50k, $50k-$75k, $75k-$100k, $100k-$150k, $150k+)
    • Households by Household Size (1, 2, 3, 4+)
    • Households by Workers per Household (0, 1, 2, 3+)
    • Households by Presence of Children (Without, With)
  • MAZ:
    • Total Households
    • Households by type (Single Family, Multi-Family, Mobile Home, Duplex)
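As an illustration of how the TAZ-level control categories relate to household attributes, the sketch below assigns one household to its income, size, worker, and children categories. It is a minimal sketch under stated assumptions, not PopulationSim configuration: the function names are hypothetical, the fourth income bin is assumed to be $35k-$50k, bin lower bounds are assumed to be inclusive, and hhchild is assumed to be a count of children.

```python
from bisect import bisect_right

# Upper bounds (exclusive) of the income bins in $2010.
# Assumption: the fourth bin is $35k-$50k and lower bounds are inclusive.
INCOME_EDGES = [15_000, 25_000, 35_000, 50_000, 75_000, 100_000, 150_000]
INCOME_LABELS = [
    "$0-$15k", "$15k-$25k", "$25k-$35k", "$35k-$50k",
    "$50k-$75k", "$75k-$100k", "$100k-$150k", "$150k+",
]


def income_bin(hhincadj):
    """Return the income control category for a household income in $2010."""
    # Negative incomes are placed in the lowest bin (an assumption).
    return INCOME_LABELS[bisect_right(INCOME_EDGES, max(hhincadj, 0))]


def taz_control_categories(hhincadj, np_, nwrkrs_esr, hhchild):
    """Return the TAZ-level control category of one household (hypothetical helper)."""
    return {
        "income": income_bin(hhincadj),
        "size": "4+" if np_ >= 4 else str(np_),
        "workers": "3+" if nwrkrs_esr >= 3 else str(nwrkrs_esr),
        "children": "With" if hhchild > 0 else "Without",
    }


# A five-person, three-worker household with two children and $62,000 income.
example = taz_control_categories(62_000, 5, 3, 2)
```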

Occupation Categories in Synthetic Population

| Number | Description | Census 2-digit SOC Occupation Codes |
| --- | --- | --- |
| 1 | Management, Business, Science, and Arts | 11, 13, 15, 17, 19, 27, 39 |
| 2 | White Collar Service Occupations | 21, 23, 25, 29, 31 |
| 3 | Blue Collar Service Occupations | 33, 35, 37 |
| 4 | Sales and Office Support | 41, 43 |
| 5 | Natural Resources, Construction, and Maintenance | 45, 47, 49 |
| 6 | Production, Transportation, and Material Moving | 51, 53, 55 |
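The table above can be expressed as a lookup from 2-digit SOC code to occupation category. This is a minimal sketch: the names are hypothetical, the use of 999 for non-workers follows the occp field in the person file, and mapping unrecognized codes to 999 is an assumption rather than documented behavior.

```python
# Occupation category -> Census 2-digit SOC codes, copied from the table above.
OCCUPATION_SOC_CODES = {
    1: (11, 13, 15, 17, 19, 27, 39),
    2: (21, 23, 25, 29, 31),
    3: (33, 35, 37),
    4: (41, 43),
    5: (45, 47, 49),
    6: (51, 53, 55),
}

# Inverted lookup: 2-digit SOC code -> occupation category.
SOC_TO_CATEGORY = {
    soc: category
    for category, soc_codes in OCCUPATION_SOC_CODES.items()
    for soc in soc_codes
}

NOT_A_WORKER = 999  # value used by the occp field for non-workers


def occupation_category(soc):
    """Map an SOC code (e.g. 47, "47", or "47-2061") to a category from 1 to 6.

    Blank or missing codes, and codes not in the table, return 999 (assumption).
    """
    if soc is None or str(soc).strip() == "":
        return NOT_A_WORKER
    major_group = int(str(soc).strip()[:2])  # the first two digits are the SOC major group
    return SOC_TO_CATEGORY.get(major_group, NOT_A_WORKER)
```

For example, `occupation_category("47-2061")` falls in category 5 because its first two digits are 47.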

The final synthetic population contains both the residential and the group quarters population. The two populations are independent of each other and have separate controls, so the process is simplified by running PopulationSim separately for each. The final outputs from the two runs are combined in a post-processing step to produce the final input synthetic population for SOABM. PopulationSim has been set up to use the following controls for generating the group quarters synthetic population for the 2010 ABM:
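The post-processing code that combines the two runs is not shown on this page. As a rough sketch of what such a step involves, assuming each run yields household and person records as lists of dictionaries keyed by hhid and sporder (the function name combine_populations is hypothetical), the populations are appended and hhid and PERID are renumbered sequentially:

```python
def combine_populations(populations):
    """Merge (households, persons) pairs and renumber hhid and PERID from 1.

    Illustrative only: each household and person is a dict, and persons refer
    to their household through the hhid of their own run.
    """
    merged_households, merged_persons = [], []
    next_hhid = 1
    for households, persons in populations:
        # Group this run's persons by their original household ID.
        persons_by_hh = {}
        for person in persons:
            persons_by_hh.setdefault(person["hhid"], []).append(person)

        for household in sorted(households, key=lambda h: h["hhid"]):
            merged_households.append({**household, "hhid": next_hhid})
            members = sorted(persons_by_hh.get(household["hhid"], []),
                             key=lambda p: p["sporder"])
            for person in members:
                merged_persons.append({
                    **person,
                    "hhid": next_hhid,
                    "PERID": len(merged_persons) + 1,  # sequential across all persons
                })
            next_hhid += 1
    return merged_households, merged_persons


# Tiny example: two residential households and one group quarters "household".
residential = (
    [{"hhid": 1, "gqflag": 0}, {"hhid": 2, "gqflag": 0}],
    [{"hhid": 2, "sporder": 1}, {"hhid": 1, "sporder": 2}, {"hhid": 1, "sporder": 1}],
)
group_quarters = ([{"hhid": 1, "gqflag": 1}], [{"hhid": 1, "sporder": 1}])
households, persons = combine_populations([residential, group_quarters])
```

Renumbering after the merge avoids hhid collisions between the runs and leaves both output lists already sorted by hhid.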

  • Meta:
    • Number of Persons in Group Quarters
  • MAZ:
    • Number of Group Quarter Households
    • Number of Group Quarter Households by Type (Major University, Other University, Military, Others)

For future year scenarios, or scenarios in which the number of households needs to be changed for a significant portion of the region, all controls will need to be updated. For land-use scenarios in which households in only a few MAZs or TAZs need to be changed (e.g. traffic impact studies), the PopulationSim tool can be run in RePopulate mode which keeps the population unchanged for all zones except the select handful that have been identified to be updated.

Creating a New Synthetic Population

To create an entirely new synthetic population, you must first update the data in the PopulationSim control files with new control totals. Note that each control, for each level of geography, must be consistent. In other words, total households at the TAZ level for any given control should match total households specified for all MAZs within the TAZ. Population controls should also be consistent with each other and with household controls. For example, the total number of persons by age should be consistent with the total number of persons across all TAZs implied by the households by size distribution. The total persons by occupation should be consistent with the total workers implied by the Households by Workers per Household distribution. Also, the controls on Households by Workers per Household and Persons by Occupation should be consistent with the total employment in the input MAZ file for the given scenario. Either the workers per job ratio should be held constant from the base year or there should be a good reason why it would be different.
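Consistency checks of this kind are easy to automate before running PopulationSim. The sketch below is illustrative only: the input structures and function names are assumptions, not the layout of the actual control files. It flags TAZ controls whose categories do not sum to the MAZ household total, and computes a lower bound on the persons implied by the households by size distribution (a lower bound because the 4+ category is open-ended).

```python
from collections import defaultdict


def check_taz_against_maz(maz_households, maz_to_taz, taz_controls, tolerance=0):
    """Return (taz, control, taz_total, maz_total) for every inconsistent control.

    maz_households: {maz: total households}
    maz_to_taz:     {maz: taz}
    taz_controls:   {taz: {control name: [households per category]}}
    """
    maz_total_by_taz = defaultdict(int)
    for maz, households in maz_households.items():
        maz_total_by_taz[maz_to_taz[maz]] += households

    mismatches = []
    for taz, controls in sorted(taz_controls.items()):
        for name, counts in controls.items():
            taz_total = sum(counts)
            if abs(taz_total - maz_total_by_taz[taz]) > tolerance:
                mismatches.append((taz, name, taz_total, maz_total_by_taz[taz]))
    return mismatches


def minimum_implied_persons(households_by_size):
    """Lower bound on persons implied by households of size 1, 2, 3, and 4+."""
    return sum(size * count for size, count in zip((1, 2, 3, 4), households_by_size))


# Example: TAZ 100 has 25 households in its MAZs, but its worker control sums to 24.
maz_households = {1: 10, 2: 15, 3: 20}
maz_to_taz = {1: 100, 2: 100, 3: 200}
taz_controls = {
    100: {"size": [5, 10, 5, 5], "workers": [5, 10, 5, 4]},
    200: {"size": [5, 5, 5, 5]},
}
problems = check_taz_against_maz(maz_households, maz_to_taz, taz_controls)
```

The total of the persons by age control should be at least `minimum_implied_persons(...)` summed over all TAZs; a smaller total indicates inconsistent controls.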

For the ABM, three separate synthetic populations are generated and then merged to form the complete household and person records input to the ABM. These three populations are:

  • The General Population (GP),
  • The Group Quarters (GQ) Population, and
  • A Visitor Population.

The discussion on this page focuses primarily on the process for the general population. However, the fields input into the ABM, which are listed below, are the same for all three populations, because the populations are merged before input into the ABM. The group quarters population is generated in the same way as the general population (using the same basic approach and tools), but far less information is available to control its generation. It is therefore controlled essentially at the MAZ level only, using total group quarters units (the same as persons) by group quarters type:

  • Civilian,
  • College / University, and
  • Military.

(Institutionalized group quarters do not need to be generated because their residents' travel is effectively nonexistent or not allowed.)

The visitor population is generated with a different approach and logic than the general and group quarters populations, although its final tables (fields) are the same. The methodology for the visitor population is described here.

Most of the inputs can be user specified. However, in Oregon, population totals by jurisdiction are established by the Population Research Center at Portland State University. The synthetic population development for the ABM needs to follow a series of steps to ensure that these set population values are achieved to the extent possible. The steps to follow to create a synthetic population consistent with statewide estimates are found here.

Update the Synthetic Population

For ABM scenarios that involve changes to households and/or the population, one or all of the various populations may need to be regenerated. A specific example of this is future year scenarios. For the work described in this wiki, the future year population assumes an aged population (a population that is on average older than current conditions). Specific steps were taken to attempt to correctly adjust all the demographics that go along with an aging population. The steps that were taken for the future year population are described here.

Update the Synthetic Population for a Subset of MAZs

To update the synthetic population for a subset of MAZs without changing the synthetic population in the rest of the region, use the RePopulate feature of PopulationSim. This feature does not require that you re-specify all of the controls used to generate the initial synthetic population. You only need to specify new controls at the MAZ level. You can use total households, households by type, or any other control, though different controls from the ones used in the creation of the initial synthetic population will require more coding. Please review the PopulationSim page and test dataset for more information on use of this feature.

Household File Format

The following table describes the synthetic population household file format that is input to CT-RAMP in the current version of SOABM. The fields marked with an asterisk are not read by the CT-RAMP software, but may be used by other processes to summarize results.

Note that the household file must be sorted by hhid before it is input to the ABM, and hhid should be numbered sequentially from 1 to the total number of households.

| Field | Description | Values |
| --- | --- | --- |
| PUMA* | PUMA ID of the household | 800, 900 for 2010; 800, 901, 902 for current and future years |
| taz | TAZ number | |
| maz | MAZ number | 1 to max MAZ number (sequential). Must match the SEQMAZ value, which is handled by the code. |
| WGTP* | Initial weight of the household in the PUMS sample | |
| serialno* | Original serial number in the PUMS sample | |
| gqflag | Binary variable indicating group quarter | 0=non-GQ, 1=Non-institutional GQ |
| gqtype* | Group Quarters Type | 1=University, 2=Military, 3=?, 4=Civilian |
| htype* | Household Type | 1=Single-family, 2=Multi-family, 3=Mobile-home, 4=Duplex |
| nwrkrs_esr | Number of workers | Number of workers |
| hhincadj | Household income | In 2010 dollars |
| hhchild* | Number of children in the household | |
| np | Household size | 1 to number of persons |
| hincp* | Unadjusted household income | In year of PUMS record |
| ten* | Tenure | 1=Owned/Mortgage, 2=Owned/Free, 3=Rented, 4=Occupied without payment of rent |
| bld* | Units in structure | 1=Mobile home or trailer; 2=One-family house detached; 3=One-family house attached; 4=2 Apartments; 5=3-4 Apartments; 6=5-9 Apartments; 7=10-19 Apartments; 8=20-49 Apartments; 9=50 or more apartments; 10=Boat, RV, van, etc. |
| adjinc* | Income adjustment factor | |
| veh | Number of autos (overwritten by auto ownership model) | |
| hht | Household/family type | -8=N/A (GQ/vacant); 1=Married-couple family household; 2=Other family hh, male householder, no wife; 3=Other family hh, female householder, no husband; 4=Nonfamily hh, male householder, living alone; 5=Nonfamily hh, male householder, not living alone; 6=Nonfamily hh, female householder, living alone; 7=Nonfamily hh, female householder, not living alone |
| type* | Type of unit | 1=Housing unit, 2=Institutional group quarters, 3=Noninstitutional group quarters |
| npf* | Number of persons in family (unweighted) | 02-20=Number of persons in family |
| hupac* | HH presence and age of children | -8=N/A (GQ/vacant); 1=With children under 6 years only; 2=With children 6 to 17 years only; 3=With children under 6 years and 6 to 17 years; 4=No children |
| hhid | Household ID (file must be sorted in order of hhid) | |
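The sort-order requirement on hhid can be verified before a model run. The sketch below is a minimal example that assumes the household file is a comma-separated file with a header row (this page does not state the file format); the function name and the sample data are hypothetical.

```python
import csv
import io


def check_household_ids(rows):
    """Return a message for every row whose hhid is not its 1-based position.

    If the list is empty, the file is sorted by hhid and numbered 1..N.
    """
    problems = []
    for expected, row in enumerate(rows, start=1):
        hhid = int(row["hhid"])
        if hhid != expected:
            problems.append(f"row {expected}: hhid is {hhid}, expected {expected}")
    return problems


# In-memory stand-in for a household file; the third household skips hhid 3.
sample = "hhid,taz,maz\n1,10,1\n2,10,2\n4,11,3\n"
rows = list(csv.DictReader(io.StringIO(sample)))
issues = check_household_ids(rows)
```

For a file on disk, `csv.DictReader(open(path, newline=""))` can replace the in-memory sample.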

Person File Format

Note that the person file must be sorted by hhid, which should be numbered sequentially from 1 to the total number of households. PERID is numbered sequentially from 1 to the total number of persons after sorting by hhid.

| Field | Description | Values |
| --- | --- | --- |
| PUMA* | PUMA ID of the household | 800, 900 |
| taz | TAZ number | |
| maz | MAZ number | 1 to max MAZ number (sequential) |
| WGTP* | Initial weight of the household in the PUMS sample | |
| serialno* | Original serial number in the PUMS sample | |
| sporder | Person number within a household | |
| employed | Recoded from ESR | 0=unemployed, 1=employed |
| soc* | Occupation code | |
| occp | Person occupation | 1=Management, Business, Science, and Arts; 2=White Collar; 3=Blue Collar; 4=Sales and Office Support; 5=Natural Resources, Construction, and Maintenance; 6=Production, Transportation, and Material Moving; 999=not a worker |
| gqflag* | Binary variable indicating group quarter | 0=non-GQ, 1=Non-institutional GQ |
| gqtype* | Group Quarters Type | Institutional, non-institutional |
| agep | Person age | Number of years |
| sex | Gender | 1=male, 2=female |
| wkhp | Hours worked per week (less than 35 is part-time) | Number of hours |
| esr | Employment status recode (1, 2, 4, 5 identify worker) | -8=less than 16 years old/did not work; 1=Civilian employed, at work; 2=Civilian employed, with a job but not at work; 3=Unemployed; 4=Armed forces, at work; 5=Armed forces, with a job but not at work; 6=Not in labor force |
| schg | School grade attending | -8=under 3 years or not enrolled; 1=Nursery school or preschool; 2=Kindergarten; 3=Grade 1 to grade 4; 4=Grade 5 to grade 8; 5=Grade 9 to grade 12; 6=College undergraduate; 7=Graduate or professional school |
| wkw | Weeks worked in past 12 months (less than 27 is part-time) | -8=less than 16 years old/did not work; 1=50 to 52; 2=48 to 49; 3=40 to 47; 4=27 to 39; 5=14 to 26; 6=less than 14 |
| mil* | Military indicator | -8=less than 17 years old; 1=Yes, now on active duty; 2=Yes, on active duty during the last 12 months, but not now; 3=Yes, on active duty in the past, but not during the last 12 months; 4=No, training for Reserves/National Guard only; 5=No, never served in the military |
| schl | Educational attainment | -8=Under 3 years; 1=No schooling completed; 2=Nursery school to 4th grade; 3=5th grade or 6th grade; 4=7th grade or 8th grade; 5=9th grade; 6=10th grade; 7=11th grade; 8=12th grade, no diploma; 9=High school graduate; 10=Some college, but less than 1 year; 11=One or more years of college, no degree; 12=Associate degree; 13=Bachelor’s degree; 14=Master’s degree; 15=Professional degree; 16=Doctorate degree |
| majoruni | Student of major university | 1=major university student, 0=not a major university student |
| PERID | Person ID (secondary sort field for file) | 1 to total number of records in file (sequential) |
| hhid | Household ID (primary sort field for file) | 1 to number of households (sequential) |
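Several person fields are recodes or thresholds of other fields, which a short sketch can make concrete. The example below encodes only what the field descriptions state: ESR codes 1, 2, 4, and 5 identify workers, fewer than 35 hours per week is part-time, and fewer than 27 weeks worked (wkw codes 5 and 6) is part-time. The function names are hypothetical, and how the model combines the two part-time indicators is not described on this page, so they are kept separate. Treating non-positive wkhp values as "no hours reported" is an assumption.

```python
WORKER_ESR_CODES = {1, 2, 4, 5}    # per the esr description: these codes identify workers
PART_TIME_WKW_CODES = {5, 6}       # 5=14 to 26 weeks, 6=less than 14 weeks


def employed_flag(esr):
    """Recode ESR to the employed field: 1=employed, 0=otherwise."""
    return 1 if esr in WORKER_ESR_CODES else 0


def part_time_by_hours(wkhp):
    """True if reported weekly hours are below 35.

    Assumption: zero or negative values mean no hours were reported.
    """
    return 0 < wkhp < 35


def part_time_by_weeks(wkw):
    """True if the wkw code indicates fewer than 27 weeks worked."""
    return wkw in PART_TIME_WKW_CODES


# Recode every ESR value listed in the table.
recoded = {esr: employed_flag(esr) for esr in (-8, 1, 2, 3, 4, 5, 6)}
```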