County-Level Geostratification of the Primary Demographic Factors for the U.S. Population


It's been a while since I last posted anything. I've been feeling the need to focus more on building an understanding of data science than on any particular toolset. For example, I've been focusing too much on information visualization. Not that there's anything wrong with info viz, but for me it is a means to an end, and that end is assisting the decision-making process. I also haven't felt much inclination to publish, since I feel I often exhaust the reader with dreariness or excessive detail.

These things considered, today I was feeling epistemological, so I decided to wax about it for a second. Consider the infinite number of attributes we could examine. Each of these attributes has a set of possible values, and not every value for a given attribute is equally likely to occur. The likelihood of each value often relates to external factors such as time or space. This seems simple, almost mundane. Yet this simple observation summarizes more than 90% of the analysis that I've seen used in a typical decision-making process. I think the tendency to immediately pursue an algorithm or complex research design neglects the need to first focus on the fundamentals. These fundamentals consist of 1) defining the concepts related to an area of inquiry, 2) examining the types of values associated with each of these concepts, and 3) exploring the relationship between these values across time and space. Since I always see longitudinal plots in business analyses, I tend to focus more on geospatial relationships on my blog because I feel they are underutilized. Simple geospatial charts convey complex relationships; color-coded charts are powerful because they tap into our innate ability to visually detect patterns.

So, I gathered a whole bunch of data from the U.S. Census Bureau and decided to illustrate the geostratification of some of the more common demographic factors: age, gender, income, education, and ethnicity. Any regional variance in the values of these social constructs offers insights useful for understanding the social stratification within our own culture.

AGE: ILLUSTRATED AS A USCB COUNTY-LEVEL CHOROPLETH CHART
Geographical indices for U.S. age population.
The older populations tend to more densely populate the rural areas surrounding larger metropolitan communities.

GENDER: THE PERCENTAGE OF POPULATION THAT IS FEMALE ILLUSTRATED AS A USCB COUNTY-LEVEL CHOROPLETH CHART
Geographical indices across the U.S. for gender.
It seems as though there is a higher share of women in the South and along the coasts. Perhaps they like warmer weather?

MEDIAN HOUSEHOLD INCOME: ILLUSTRATED AS A USCB COUNTY-LEVEL CHOROPLETH CHART
Geographical indices across the U.S. for income.
The higher household incomes tend to co-occur with larger metropolitan communities. Could this be an artifact of using the median to measure this value?

EDUCATION: PERCENTAGE OF INDIVIDUALS OBTAINING BACHELOR'S OR HIGHER DEGREES SHOWN AS A USCB COUNTY-LEVEL CHOROPLETH CHART
Geographical indices across the U.S. for education.
Higher educational attainment also tends to co-occur with metropolitan communities. Perhaps education and income covary?
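One way to start checking that hunch is to compute a correlation coefficient between the two county-level measures. The sketch below is only an illustration: the function name `pearson_r` and the five pairs of values are hypothetical placeholders, not figures from the Census data behind these charts.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squares around the means
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when a sequence is constant")
    return cross / sqrt(ss_x * ss_y)

# Hypothetical county values, for illustration only (not real Census figures)
pct_bachelors_or_higher = [12.0, 18.5, 25.0, 33.0, 41.5]
median_household_income = [38000, 44000, 52000, 61000, 75000]

r = pearson_r(pct_bachelors_or_higher, median_household_income)
print(f"r = {r:.3f}")
```

A value near +1 would be consistent with the two measures rising together across counties. It would not say which one drives the other, and a third factor such as metropolitan status could be behind both.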

ETHNICITY: ILLUSTRATED AS A SERIES OF USCB COUNTY-LEVEL CHOROPLETH CHARTS, ONE FOR EACH OF THE FIVE LARGEST GROUPS

WHITE, NON-HISPANIC POPULATION PROPENSITY
Geographical indices for U.S. White, Non-Hispanic population.
This group seems to densely populate most of the country; however, there are distinct areas where it is less densely represented.

HISPANIC POPULATION PROPENSITY
Geographical indices for U.S. Hispanic population.
The Hispanic group seems to more densely populate the Southwest. This could account for part of the gap seen in the White, Non-Hispanic population.

BLACK, OR AFRICAN-AMERICAN POPULATION PROPENSITY
Geographical indices for U.S. Black or African-American population.
The Black or African-American population tends to more densely populate the South. This seems to account for the other gap seen in the White, Non-Hispanic population.

ASIAN ALONE POPULATION PROPENSITY
Geographical indices for U.S. Asian population.
The Asian population tends to more densely populate the coasts.

ASIAN OR PACIFIC-ISLANDER POPULATION PROPENSITY
Geographical indices for U.S. Pacific Islander population.
Notice how each group seems to have a distinct density pattern. This illustrates how powerful these charts are as an analytical tool, and why geostratification is worth using in conjunction with longitudinal analysis.

The fine print: these are choropleth charts. I created them using D3, but I'm showing them as static images to decrease page load times. Each chart shows the value for each county indexed against the overall propensity for that attribute. This normalization lets the charts show differences in relative degree rather than absolute differences. I use only nine colors, which I chose intentionally to bolster both usability and efficacy. All nine are actually just different shades of the same color, and the shades diverge as the values within a region move further from the expected value.
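For readers who want to try the indexing themselves, here is a minimal sketch of the idea in Python. It is not the code behind these charts: it assumes the index is the county value divided by the overall value, times 100, and the eight break points that split the index into nine classes are hypothetical. The names `index_against_overall`, `color_class`, and `BREAKS` are my own placeholders.

```python
from bisect import bisect_right

def index_against_overall(county_values, overall_value):
    """Express each county's value as an index where 100 equals the overall value."""
    if overall_value == 0:
        raise ValueError("overall_value must be non-zero")
    return {county: 100.0 * value / overall_value
            for county, value in county_values.items()}

# Hypothetical break points: eight thresholds produce nine classes,
# with the middle class (4) straddling the expected index of 100.
BREAKS = [50, 70, 85, 95, 105, 115, 130, 150]

def color_class(index_value, breaks=BREAKS):
    """Return a class from 0 (far below expected) to 8 (far above expected)."""
    return bisect_right(breaks, index_value)

# Hypothetical percentages for three made-up counties, for illustration only
county_pct = {"County A": 15.0, "County B": 30.0, "County C": 45.0}
indices = index_against_overall(county_pct, overall_value=30.0)

for county, idx in indices.items():
    print(county, idx, color_class(idx))
# County A 50.0 1
# County B 100.0 4
# County C 150.0 8
```

Each class number would then map to one of the nine shades. Because the classes are defined on the index rather than on the raw value, the same break points can be reused across attributes measured in different units.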

Data Sources: U.S. Census Data
