
Conduct a representative sample | Be US Census Bureau’s Chief Analyst (Lesson 3 of 5) | 6-8

Student Objective

Students will be able to:
1. Conduct a representative random sample and compare it to the population, using a simulation in Google Sheets.


Materials Needed:

Step 1:  Own It Activity

  • Think-Pair-Share: Ask students, “How would you take a representative sample from 100 one-ton crates of meat?”

Step 2:  Introduce Enduring Question

  • Last lesson, we asked ourselves, “What conditions are required for a sample to be representative of its population?”
  • Today, we ask, “How do we conduct a 10% random sample from a population using Google Sheets?”

Step 3: Model How to Randomly Select Observations in Dataset

  • Model for students, as they do it with you:
    • Open up, “FakeData_Students”
    • In the tab, “Sheet1,” click the cell under the words, “Random Students” in column K
      • Check to make sure you can see the following Google Sheets formula in the cell:
        • =ArrayFormula(Array_Constrain(vlookup(Query({ROW(A1:A101),randbetween(row(A1:A101)^0,9^9)},"Select Col1 order by Col2 Asc"),{row(A1:A101),A1:A101},2,FALSE),100*0.1,1))
      • Double-click into the cell, highlight the entire formula, and copy it by pressing “CTRL+C”
      • Click out of the cell
      • Click back on the cell
      • Press “CTRL+V” to paste
      • The numbers should now have changed. These are the SERIAL numbers of the households in our 10% random sample
        • We have just run a random lottery where we grabbed 10% of the lottery balls, but in this case the balls are households
    • Explain to students:
      • Each time you paste the formula back into that cell with “CTRL+V,” you re-run the randomization (or the lottery)
      • “You’ve now conducted a 10% random sample, but you only have the SERIAL identifier of each household.  We have to make sure we get the household data that comes with the households themselves.”
    • Have students:
      • Look at the tab “10%RandomSample” to see that the sampled SERIAL numbers are connected to their household data
      • Test and try out:
        • have students run a new sample
        • check the “10%RandomSample” tab to see if the SERIAL numbers match with their data
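
The lottery in Step 3 can also be sketched outside the spreadsheet. The Python below is only an illustration, not part of the lesson files: the `households` table, the `household_size` variable, and the `draw_sample` helper are made-up stand-ins for the data in “FakeData_Students.” It draws 10% of the SERIAL numbers at random, then looks up the full record for each one, which is roughly what the “10%RandomSample” tab appears to do.

```python
import random

# Hypothetical stand-in for the household table: SERIAL -> household record.
# The real lesson data lives in the spreadsheet; these values are invented.
households = {
    serial: {"SERIAL": serial, "household_size": 1 + serial % 5}
    for serial in range(1, 101)
}


def draw_sample(serials, fraction, rng):
    """Pick the given fraction of SERIAL numbers at random, without repeats."""
    sample_size = round(len(serials) * fraction)
    return rng.sample(serials, sample_size)


rng = random.Random(2010)  # fixed seed so the example is repeatable
sampled_serials = draw_sample(list(households), 0.1, rng)

# Look up the full record for each sampled SERIAL number.
sample_rows = [households[serial] for serial in sampled_serials]

print(sampled_serials)
```

Calling `draw_sample` again with the same `rng` re-runs the lottery and returns a different set of households, much like pasting the formula back into the cell.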

Step 4: Have Students Conduct a Random Sample 

  • Have students open up, “NewYorkDataOnly_Students”
  • Have students conduct a random sample of the NewYorkDataOnly_Students data on their own


  • The sample will take 5-7 minutes to finish calculating.
  • Use the extra time to teach students how the function works for conducting a random sample
    • =ArrayFormula(Array_Constrain(vlookup(Query({ROW(A1:A101),randbetween(row(A1:A101)^0,9^9)},"Select Col1 order by Col2 Asc"),{row(A1:A101),A1:A101},2,FALSE),100*0.1,1))
      • Sections of function:
        • ROW(A1:A101) –> A1:A101 means the cells where my SERIAL numbers are
        • 100*0.1 –> a calculation for the number of observations I want to sample.  100 = all observations in data; 0.1 = 10%; 100*0.1 = 10% of 100 observations (total of 10)
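
For teachers who want to explain the rest of the formula, here is one way to read it, written as a short Python sketch. This is an interpretation rather than an official breakdown: the function name `random_sample_like_sheet` is hypothetical, and the list of 100 SERIAL numbers is made up. Each comment names the part of the formula that the line imitates.

```python
import random


def random_sample_like_sheet(serials, fraction, rng):
    """Imitate the spreadsheet formula one piece at a time."""
    # {ROW(...), RANDBETWEEN(ROW(...)^0, 9^9)}: pair every row position with
    # a random whole number between 1 and 9^9 (any number to the power 0 is 1).
    keyed = [(rng.randint(1, 9**9), position) for position in range(len(serials))]

    # QUERY(... "Select Col1 order by Col2 Asc"): sort the row positions by
    # their random number, which shuffles them.
    shuffled_positions = [position for _, position in sorted(keyed)]

    # VLOOKUP(...): turn each row position back into its SERIAL number.
    shuffled_serials = [serials[position] for position in shuffled_positions]

    # ARRAY_CONSTRAIN(..., 100*0.1, 1): keep only the first 10% of the rows.
    keep = round(len(serials) * fraction)
    return shuffled_serials[:keep]


serials = list(range(1001, 1101))  # 100 made-up SERIAL numbers
sample = random_sample_like_sheet(serials, 0.1, random.Random(3))
print(sample)
```

Sorting by a random number and keeping the top of the list is a common way to sample without picking the same row twice.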

Step 5: Have Students Calculate Statistics from Their Random Sample and Compare to Population Statistics

  • Using Google Sheets Pivot Tables, have students calculate the average value of each variable for:
    • their random sample
    • their total NY 2010 population
  • Think-Pair-Share:
    • Have students compare their population statistics with their sample statistics
      • Are they the same?
      • Are they different?  How different?
      • Can you use this sample to represent the population?
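
The comparison in Step 5 can be sketched in Python as well, as a rough parallel to the pivot tables. Everything in this example is invented for illustration: the population is randomly generated, and the variable names `household_size` and `income` are assumptions, not columns from the NY 2010 data.

```python
import random
from statistics import mean

rng = random.Random(7)  # fixed seed so the example is repeatable

# Made-up population of 1,000 households (not the real NY 2010 data).
population = [
    {"household_size": rng.randint(1, 6), "income": rng.randint(20, 120) * 1000}
    for _ in range(1000)
]

# A 10% random sample drawn without repeats.
sample = rng.sample(population, round(len(population) * 0.1))


def averages(rows, variables):
    """Average value of each variable across the given rows."""
    return {name: mean(row[name] for row in rows) for name in variables}


variables = ["household_size", "income"]
pop_avg = averages(population, variables)
sample_avg = averages(sample, variables)

for name in variables:
    difference = sample_avg[name] - pop_avg[name]
    print(f"{name}: population={pop_avg[name]:.2f} "
          f"sample={sample_avg[name]:.2f} difference={difference:+.2f}")
```

The sample averages will rarely equal the population averages exactly, so the question for students is whether the differences are small enough for the sample to stand in for the population.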

Step 6: Stamp and End Lesson, Introduce Next Topic

  • In this lesson, we asked ourselves, “How do we conduct a representative 10% random sample?”
  • Tomorrow, we will learn how to make statistical inferences from our sample statistics by comparing them to statistics from a different sample of the same data.




EdTech used in this activity:

Google Sheets
