Life as Clay

Posts Tagged ‘population pyramid’

New Population Pyramid Generator

I created a new OS X population pyramid generator application, called Pyramids. Click here to see it in the Mac App Store. It’s $0.99. If you have a Mac and need to make population pyramids with any regularity, or even just a single one, then give it a shot!



Performance of a free niche app in the Mac App Store

I released the free Population Pyramid Generator in the Mac App Store at the end of last week. The tool is very simple. I tried to make sure that it was stable and that it does what it claims to do without trouble. I think that I succeeded on those fronts. There certainly is room for improvement, however, and I have plans to add several features in a forthcoming revision.

My expectations upon release of the application were that maybe 50-60 people would download it. Let’s be honest: it’s a niche tool, and while useful for people who need a population pyramid, it’s pretty useless to anybody else. Part of my motivation to create it was that my business website receives a lot of hits related to population pyramids because I wrote a blog post there about them. I thought that it would be nice to provide the tool and that it potentially could help me find additional clients.

What I didn’t expect is that aggregation sites would pick up the app. It turns out that a lot of people are exposed to it through those sites. Its appearance in Google rankings received a quick boost and traffic to my company’s website skyrocketed shortly after the app became available.

In the first 5 days, the Population Pyramid Generator was downloaded nearly 450 times — ten times my expectation. It went live around 4pm EST on 02/24.

Equally surprising was the breakdown by country. This view shows only the first 4 days (due to the week cutoff in the iTunes Connect interface):

Considering that I did basically zero marketing of this app, what’s the lesson here? I think that simple free tools, especially ones that fill an unmet niche, can be relatively effective marketing tools for a business. All told, it took me about 1.5 weeks to build the Population Pyramid Generator. As people with a need for such a tool find and download it, perhaps I’ll gain an additional client or two. That alone would be worth the time investment. In fact, the increase in Google rankings following the release of the app probably makes it worth the time investment by itself.

The Population Pyramid Generator is a fun experiment for me and I look forward to adding additional features to it, as time allows. I also plan to create a similar tool for Windows (and use it as a project to teach myself .NET and C#).

Written by Clay

March 1, 2011 at 08:40

DC as seen through population pyramids

All of the recent work on parsing US Census data was part of a larger project — one that includes the dynamic generation of population pyramids for the entire population and for selected racial groups within each county in the United States. Residents of DC (like myself) frequently hear about how it is a terrible place for young women to meet boyfriends and date successfully. All data here are from the 2009 US Census Bureau population estimates.

Here’s what the population pyramids show:

For the final one, keep in mind that the US Census Bureau treats “Hispanic” as an ethnicity rather than a race. Most Hispanics also select a race on census forms, and most people who select a race also indicate whether they consider themselves Hispanic.

What really is striking here is the difference between the shapes of the white and black population pyramids. Perhaps a lot of young white people move to DC for congressional jobs and then move away when the job is finished. DC traditionally has a larger permanent black population and that is reflected in the more even distribution of the pyramid. However, DC also is known for having one of the least healthy black populations in the country, a fact reflected in the low numbers of elderly people. For comparison, look at this view of whites in Palm Beach, Florida:

Back to the original question: yes, you can see in the population pyramid that there are more females than males in Washington, DC, except in the population that identifies as ethnically Hispanic. Where do you go to find the opposite problem? One place is Honolulu, Hawaii:

Only Native Hawaiians (a group including other Pacific Islanders) show a normal distribution:

(All of these population pyramids were generated using CSS in a custom script written in Ruby on Rails.)
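
For readers curious how a CSS-only pyramid can work, here is a minimal sketch in plain Ruby. It is not the Rails script mentioned above: the method name pyramid_html, the inline styles, and the colors are all made up for illustration. It assumes three parallel arrays (labels, male counts, female counts, youngest group first) with at least one positive count, and draws each bar as a div whose width is a percentage of the largest count.

```ruby
# Hypothetical helper (not the original Rails code): build an HTML population
# pyramid from parallel arrays of male and female counts, youngest group first.
# Bar widths are percentages of the largest single count, so the widest bar
# fills its half of the chart.
def pyramid_html(labels, males, females)
  max = (males + females).max.to_f
  rows = labels.each_index.map do |i|
    male_pct   = (males[i] / max * 100).round(1)
    female_pct = (females[i] / max * 100).round(1)
    <<~ROW
      <div style="display:flex; align-items:center; height:14px;">
        <div style="width:40%; display:flex; justify-content:flex-end;">
          <div style="width:#{male_pct}%; height:12px; background:#4477aa;"></div>
        </div>
        <div style="width:20%; text-align:center; font-size:11px;">#{labels[i]}</div>
        <div style="width:40%;">
          <div style="width:#{female_pct}%; height:12px; background:#cc6677;"></div>
        </div>
      </div>
    ROW
  end
  # Oldest group goes on top, as in a conventional pyramid.
  rows.reverse.join
end
```

Writing the result to a file, for example with File.write("pyramid.html", pyramid_html(labels, males, females)), gives a page that any browser can display. Male bars grow leftward from the center labels and female bars grow rightward.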

Written by Clay

July 26, 2010 at 12:40

Parsing US Census data with Ruby

I am a fan of population pyramids as a visualization of age demographics. They are simple and effective. Sometimes getting the data to generate them isn’t….

I’m building a simple Rails app for a client who needs county-level population data. It doesn’t have to be precise. I went with the 2009 estimates from the Census Bureau, located here:

There were a few issues:
1. Each state’s county-level data is in a different file. You can order a CD that has all of the data in a single file, but I didn’t have time to do that.
2. There’s a ton of data that I didn’t need in each file. All I needed were the county totals and the totals for male/female for each age group.
3. The data is in rows — a different row for each age group. I needed the data in columns, with one row for each county.

What I did:
1. Downloaded all 51 of the files (DC has its own file) and put them in a folder on my desktop.
2. Changed the extension on them to .csv (because it changed to .txt when I downloaded, for some reason).
3. Wrote the Ruby script below, which extracts the data that I need and puts it into a single .csv file.
4. Saved the script to the same folder as the Census files as script.rb.
5. Opened terminal and ran ruby script.rb
6. Reconciled the data with my Excel sheet that I was using to collect county-level data so that I could seed my Rails app’s counties table.
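
The renaming in step 2 can also be scripted rather than done by hand. This is a minimal sketch, assuming the downloaded files follow the cc-est2009-alldata-XX naming used in the script below and that only the extension is wrong. The helper name rename_census_files is hypothetical.

```ruby
# Hypothetical helper: rename every cc-est2009-alldata-*.txt file in a folder
# to .csv and return the new file names. Other .txt files are left alone.
def rename_census_files(folder)
  pattern = File.join(folder, "cc-est2009-alldata-*.txt")
  Dir.glob(pattern).sort.map do |old_path|
    new_path = old_path.sub(/\.txt\z/, ".csv")
    File.rename(old_path, new_path)
    File.basename(new_path)
  end
end
```

Calling rename_census_files(".") from the folder that holds the downloads would do the same job as renaming the 51 files manually.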

I comment my code pretty heavily when I write it so that I can look back and remember what I did. I’ve been burned in the past by not commenting thoroughly.

Here’s the script:

# Script to pull population data out of census files for each county
# Original data files are here:

require 'csv'

# These are the FIPS codes for each state which are used in file names.
state_fips_codes = ["01", "02", "04", "05", "06", "08", "09", "10", "11", "12", "13", "15", "16", "17", "18", "19", "20", "21", "22", "23", "24", "25", "26", "27", "28", "29", "30", "31", "32", "33", "34", "35", "36", "37", "38", "39", "40", "41", "42", "44", "45", "46", "47", "48", "49", "50", "51", "53", "54", "55", "56"]

add_header = true

# This is the file that all of the data will go into.
File.open("county_age_data.csv", "w") do |output_file|
  # counter for iterating through the files
  n = 0
  state_fips_codes.length.times do  # This loop goes through all of the files
    # Read in the relevant file
    state_data = CSV.read("cc-est2009-alldata-" + state_fips_codes[n] + ".csv")
    puts "Processing data for " + state_data[1][3]

    # Add the header to the file
    if add_header == true
      output_file.puts "state_fips,county_fips,state_name,county_name,:census_total_pop,:census_total_male,:census_grp01_male,:census_grp02_male,:census_grp03_male,:census_grp04_male,:census_grp05_male,:census_grp06_male,:census_grp07_male,:census_grp08_male,:census_grp09_male,:census_grp10_male,:census_grp11_male,:census_grp12_male,:census_grp13_male,:census_grp14_male,:census_grp15_male,:census_grp16_male,:census_grp17_male,:census_grp18_male,:census_total_female,:census_grp01_female,:census_grp02_female,:census_grp03_female,:census_grp04_female,:census_grp05_female,:census_grp06_female,:census_grp07_female,:census_grp08_female,:census_grp09_female,:census_grp10_female,:census_grp11_female,:census_grp12_female,:census_grp13_female,:census_grp14_female,:census_grp15_female,:census_grp16_female,:census_grp17_female,:census_grp18_female"
      add_header = false
    end

    # Iterate through the loaded file.
    row = 1
    while row < state_data.length
      if state_data[row][5].to_i == 12 # Only take data from year coded 12 (2009 estimates)

        ########## MAIN LOGIC #############
        if state_data[row][6].to_i == 0 # It's a new county because this is the group 0 row with totals

          # Reset the data arrays
          row_data    = []
          male_data   = []
          female_data = []

          row_data.push state_data[row][1].to_s     # state_fips
          row_data.push state_data[row][2].to_s     # county_fips
          row_data.push state_data[row][3].to_s     # state_name
          row_data.push state_data[row][4].to_s     # county_name
          row_data.push state_data[row][7].to_s     # census_total_pop

          male_data.push state_data[row][8].to_s    # census_total_male
          female_data.push state_data[row][9].to_s  # census_total_female

        elsif state_data[row][6].to_i == 18 # It's the last row -- push the data into the array

          # push in the final data points
          male_data.push state_data[row][8].to_s    # census_grp18_male
          female_data.push state_data[row][9].to_s  # census_grp18_female

          # append all of the data to the row_data array
          row_data += male_data
          row_data += female_data

          # write the data to the file
          output_file.puts row_data.join(",")

        else # It's not the first or last row

          male_data.push state_data[row][8].to_s    # census_grpXX_male
          female_data.push state_data[row][9].to_s  # census_grpXX_female

        end # End of the if statement for pushing data into arrays
        ################ END OF MAIN LOGIC ####################
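      end       # End of the if statement to check if it is year 12
      row += 1  # Look at the next row
    end         # End of the while loop over the rows

    n += 1      # Increment the counter
  end           # End of loop for going through each file
end             # End of the File.open block (closes the output file)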
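
The row-to-column reshaping at the heart of the script can also be written more compactly with group_by. The sketch below is an alternative, not the code I actually ran. The method name pivot_counties is hypothetical, and it assumes the same column positions the script relies on (1 to 4 for FIPS codes and names, 5 for the year code, 6 for the age group, 7 to 9 for total, male, and female counts).

```ruby
require 'csv'

# Hypothetical helper: turn parsed Census rows (one row per county and age
# group) into one wide row per county, in the same column order as the
# header written by the script above.
def pivot_counties(rows, year_code = 12)
  rows.select { |r| r[5].to_i == year_code }      # keep only the wanted year
      .group_by { |r| [r[1], r[2]] }              # one group per state/county FIPS pair
      .map do |_fips, group|
        sorted = group.sort_by { |r| r[6].to_i }  # age group 0 (the totals row) first
        totals = sorted.first
        totals[1..4] + [totals[7]] +
          sorted.map { |r| r[8] } +               # total_male, then each male age group
          sorted.map { |r| r[9] }                 # total_female, then each female age group
      end
end
```

Each array it returns can be written out with output_file.puts row.join(","), just as in the script. The header row of a Census file is dropped automatically because its year column does not convert to 12.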

Written by Clay

July 15, 2010 at 21:17

Posted in Code, Ruby
