
Data and Statistics

Resources for finding and using data and statistics. This guide focuses on data for the social sciences.

Rich Data Services

Statistics Canada provides an analytical platform called Rich Data Services (RDS) for exploring Public Use Microdata Files (PUMFs). RDS has an Explorer section and a Tabulation Engine that allow users to browse, interact with, and download data and metadata. RDS lets users manipulate data and perform basic statistical operations without the more specialized knowledge needed for statistical platforms like R and Stata.

RDS Explorer allows users to tailor datasets to their specific research needs through variable selection and filtering. RDS Tabulation Engine allows users to cross-tabulate data, create summary statistics, and run regressions. Both services allow users to download their results in various formats.

RDS Explorer

Click on RDS Explorer.

 

Change Data Product allows you to explore different datasets.

You can select a dataset by clicking on the title of the dataset.

 

NOTE: You can return to the RDS home page by clicking HOME in the top left, but your work will NOT be saved. Be sure to download your data first, or open the home page in a different tab.

For more detailed instructions, see the DLI's RDS user guide.

Explore brings you to the dataset you have selected, allowing you to explore and modify it.

 

If you go through the tabs at the top:

  • Clicking Details allows you to see the description and methodology of the dataset: who the sample population is, an overview of the population characteristics of interest, how the data was collected, etc.

 

  • Clicking Data brings you to a view of the data--each row is a different observation and each column is a different variable.
  • You can navigate through the variables (columns) and observations (rows) by clicking on the arrows in the upper right and lower right respectively.

 

  • Clicking on a variable header gives you access to further metadata about each variable.
    • Details provides variable characteristics, e.g. which interview question or population feature the variable corresponds to and the variable's range of valid values.
    • Codes lists the possible variable values: both the numerical values and their associated meanings.
    • Filter allows you to select one or more variable values and filter the entire dataset by those values.
    • Stats provides a visualization of some summary statistics on that variable.

 

  • Dictionary provides a complete description of the variable(s), including their type, definitions for each value, number of decimals if applicable, etc. This detailed description is known as a data dictionary or codebook.
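
For readers who later work with a downloaded extract outside RDS, the relationship between Codes, Filter, and the data dictionary can be sketched in a few lines of Python. This is only an illustration, not how RDS works internally: the variable names (SEX, AGE), the code values, and the helper functions decode and filter_rows are all hypothetical.

```python
# Hypothetical codebook for one variable. In RDS, the real codes and their
# meanings are listed under Codes and in the Dictionary tab.
CODEBOOK = {
    "SEX": {1: "Male", 2: "Female", 9: "Not stated"},
}

# Hypothetical observations: one dict per row, one key per variable.
rows = [
    {"ID": 1, "SEX": 1, "AGE": 34},
    {"ID": 2, "SEX": 2, "AGE": 51},
    {"ID": 3, "SEX": 9, "AGE": 27},
    {"ID": 4, "SEX": 2, "AGE": 45},
]


def decode(row, codebook):
    """Return a copy of row with coded values replaced by their labels.

    Variables without a codebook entry (e.g. AGE) are left unchanged.
    """
    return {
        var: codebook.get(var, {}).get(value, value)
        for var, value in row.items()
    }


def filter_rows(rows, variable, keep_values):
    """Keep only the rows whose value for `variable` is in keep_values."""
    keep = set(keep_values)
    return [row for row in rows if row[variable] in keep]


# Filter on the numeric code first, then attach the labels for reading.
females = filter_rows(rows, "SEX", [2])
labelled = [decode(row, CODEBOOK) for row in females]
print(labelled)
```

Filtering on the numeric code and decoding afterwards keeps the stored data unchanged; the labels are only applied for display.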

 

You can select variables in multiple ways:

  • First, by going to Data, clicking on a variable, and then hitting SELECT.
  • Second, by going to Dictionary, clicking on a variable, and then hitting SELECT.
  • You can also use SELECT ALL to select every variable, or UNSELECT ALL to clear your selection.

 

Once you have selected a variable, you can filter your dataset by SELECTED/UNSELECTED variables.


 

Selected variables are also put in your Package. Your data filters are preserved there.

 


Go to Package to download your selected variables.

 

Clicking ADD OUTPUT FORMAT(S) will give you the following options:

  • Bundles allows you to choose several formats at once as a "bundle" grouped by common usage.

    • You can edit the bundles once they have been selected.

 

  • All Formats allows you to select the specific formats you want.

 

  • Once you have chosen, click SELECT.

  • You can click RESET OUTPUT FORMATS to undo your output selection.

  • Click PACKAGE AND DOWNLOAD to download your dataset in the format(s) you have chosen.
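
Once a package has been downloaded, the file can be read with general-purpose tools. The following is a minimal sketch, assuming you chose a CSV output format and that the file has a header row of variable names; both are assumptions to check against your own download. The column names and values are hypothetical, and an in-memory string stands in for the downloaded file.

```python
import csv
import io

# Stand-in for a downloaded file. In practice you would open the CSV file
# from your package instead. Column names and values are hypothetical.
downloaded = io.StringIO(
    "ID,SEX,AGE\n"
    "1,1,34\n"
    "2,2,51\n"
    "3,2,45\n"
)


def load_extract(handle):
    """Read a CSV extract into a list of dicts, converting every value to int.

    Assumes every column holds integer codes or counts; adjust the
    conversion if your extract contains decimals or text.
    """
    reader = csv.DictReader(handle)
    return [{name: int(value) for name, value in row.items()} for row in reader]


extract = load_extract(downloaded)
print(len(extract), "rows; variables:", list(extract[0]))
```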

 


RDS Tabulation Engine

Click on Tabulation Engine.

 

Change Data Product allows you to explore different datasets.

You can select a dataset by clicking on the title of the dataset.

 

NOTE: There does not seem to be a way to return to the RDS home page from here, so you may wish to open the Tabulation Engine in its own tab. Your work will NOT be saved if you exit the Tabulation Engine, so be sure to download your table first, or open the home page in a different tab.


Data Product brings you to the dataset you have selected.

If you go through the tabs at the top:

  • Clicking Details allows you to see the description and methodology of the dataset.

 

  • Clicking Dictionary provides a complete description of the variable(s), including their type, definitions for each value, number of decimals if applicable, etc. This detailed description is known as a data dictionary or codebook.

 

  • Clicking on a variable header gives you access to further metadata about each variable.
    • Details provides variable characteristics, e.g. which interview question or population feature the variable corresponds to and the variable's range of valid values.
    • Codes lists the possible variable values: both the numerical values and their associated meanings.
    • Stats provides a visualization of some summary statistics on that variable.

 


Once you have chosen a data product, you can go to Tabulation, which brings you to the table you are building.

  • By selecting Rows and Columns, you can choose variables to make up your rows or columns.
    • To select a variable, you can scroll through the list of variables or search using a key word.
  • Once you have chosen your row and column variables, you can choose the measure you want displayed in your table by selecting Measure.
    • These measures are typical summary statistics, e.g. count, percent, and sum.
    • Some measures require you to choose a variable.
  • By selecting Filter, you can filter your table by the specific value of another variable, e.g. you can limit your sample to a specific population or region.
  • By selecting Weight(s), you can choose to adjust the sample by selecting a weight variable.
    • Reweighting the sample accounts for discrepancies between the distribution of groups within the sample and the estimated distribution within the actual population of interest.
    • Check the description of the data for how the data is weighted.

  • To switch rows and columns, click the button next to the DOWNLOAD button.
  • To reset your table, click the button next to the TABULATE button.
  • To tabulate your data, click TABULATE.
    • You can make changes to your table and then click TABULATE again.
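
To make the effect of a weight variable concrete, the sketch below builds a small cross-tabulation twice, once as plain counts and once as sums of weights, and then converts the weighted table to percentages. It is an illustration under stated assumptions rather than a description of how the Tabulation Engine computes its measures: the variables (REGION, SEX, WEIGHT), the data, and the choice to express percentages relative to the grand total are all hypothetical.

```python
from collections import defaultdict

# Hypothetical microdata: each respondent has a region, a sex, and a survey
# weight (roughly, how many people in the population the respondent represents).
rows = [
    {"REGION": "East", "SEX": "F", "WEIGHT": 120.0},
    {"REGION": "East", "SEX": "M", "WEIGHT": 80.0},
    {"REGION": "West", "SEX": "F", "WEIGHT": 150.0},
    {"REGION": "West", "SEX": "F", "WEIGHT": 50.0},
    {"REGION": "West", "SEX": "M", "WEIGHT": 100.0},
]


def crosstab(rows, row_var, col_var, weight_var=None):
    """Sum weights (or count rows when weight_var is None) per (row, column) cell."""
    table = defaultdict(float)
    for r in rows:
        table[(r[row_var], r[col_var])] += r[weight_var] if weight_var else 1.0
    return dict(table)


def to_percent(table):
    """Express each cell as a percentage of the grand total (an assumed convention)."""
    total = sum(table.values())
    return {cell: 100.0 * value / total for cell, value in table.items()}


unweighted = crosstab(rows, "REGION", "SEX")
weighted = crosstab(rows, "REGION", "SEX", weight_var="WEIGHT")
weighted_pct = to_percent(weighted)

print(unweighted[("West", "F")], weighted[("West", "F")], weighted_pct[("West", "F")])
```

In this toy sample, two of five respondents are women in the West, but once the weights are applied the same cell represents 200 of an estimated 500 people. Weighted and unweighted tables can differ in this way whenever some groups are over- or under-sampled.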

 


Once you have your table, you can download it by clicking DOWNLOAD, selecting an output format, and then clicking DOWNLOAD again from the pop-up.

  • Clicking DOWNLOAD also saves your table to Data Extract(s). You can save multiple tables and then go to Data Extract(s) to retrieve them later.

 


Once you have chosen a data product, you can go to Regression, which allows you to run a least squares regression on variables from the dataset you currently have selected.

  • By selecting Dependent Variable and Independent Variable, you can choose the outcome variable and the explanatory variable(s) for your regression.
    • To select a variable, you can scroll through the list of variables or search using a key word.
    • You can select multiple independent variables.
  • By selecting Filter, you can filter your variables of interest by the specific value of another variable, e.g. you can limit your sample to a specific population or region.
  • By selecting Weight(s), you can choose to adjust the sample by selecting a weight variable.
    • Reweighting the sample accounts for discrepancies between the distribution of groups within the sample and the estimated distribution within the actual population of interest.
    • Check the description of the data for how the data is weighted.

 

  • To reset your regression, click the button next to the CALCULATE button.
  • To run your regression, click CALCULATE.
    • You can make changes to your regression and then click CALCULATE again.
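
As a rough illustration of what a weighted least squares fit does, the sketch below estimates an intercept and slope for a single independent variable using the closed-form solution. It is not a description of the estimator RDS uses, which is not documented here. It is also limited to one independent variable, whereas RDS lets you select several. The function name, the variables (age, income), and the data are hypothetical.

```python
def weighted_least_squares(x, y, weights=None):
    """Fit y = intercept + slope * x by minimizing the weighted sum of squared residuals.

    With weights=None every observation counts equally (ordinary least squares).
    """
    if weights is None:
        weights = [1.0] * len(x)
    total = sum(weights)
    # Weighted means of the independent and dependent variables.
    x_mean = sum(w * xi for w, xi in zip(weights, x)) / total
    y_mean = sum(w * yi for w, yi in zip(weights, y)) / total
    # Weighted covariance of x and y, and weighted variance of x (both unnormalized).
    sxy = sum(w * (xi - x_mean) * (yi - y_mean) for w, xi, yi in zip(weights, x, y))
    sxx = sum(w * (xi - x_mean) ** 2 for w, xi in zip(weights, x))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return intercept, slope


# Hypothetical data constructed so that income = 5 + 1 * age exactly.
age = [20.0, 30.0, 40.0, 50.0]
income = [25.0, 35.0, 45.0, 55.0]
weight = [100.0, 250.0, 150.0, 50.0]

intercept, slope = weighted_least_squares(age, income, weight)
print(round(intercept, 6), round(slope, 6))
```

Because the toy data lie exactly on a line, the weights do not change the estimates here; with real survey data, weighted and unweighted fits generally differ.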

 


Once you have the results of your regression, you can download the resulting table by clicking DOWNLOAD, selecting an output format, and then clicking DOWNLOAD again from the pop-up.

  • Clicking DOWNLOAD also saves your regression results to Data Extract(s). You can save multiple regressions and then go to Data Extract(s) to retrieve them later.

 
