Interactive Downloader Tutorial

This page provides an in-depth explanation of the various uses and functions of our interactive downloader. The downloader is simple to use, but some of its more obscure elements may require explanation. Please read through this tutorial if you have questions about how to use it. Click the link here to go to the interactive downloader and follow along with this tutorial.

This video will walk through the basics of using the interactive downloader.

Basic Overview:

The interactive downloader allows users to access our database via a series of preformatted commands that are set with a few simple dropdown menus. There are seven dropdown menus that must be set: Country, Jurisdiction, Unit, Document Type, Series, Start Date, and End Date. There are also sub-menus that appear when certain options are selected in the main menus. Those will be discussed later on in this tutorial.
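For readers who think in code, the seven required menus can be pictured as one record that must be completely filled in before a download can run. The sketch below is only an illustration: `DownloaderSelection` and `missing_menus` are hypothetical names invented for this tutorial, not part of the downloader or of our API libraries, and the example values are placeholders.

```python
from dataclasses import dataclass, fields


@dataclass
class DownloaderSelection:
    """Hypothetical container mirroring the seven dropdown menus."""

    country: str = ""
    jurisdiction: str = ""
    unit: str = ""
    document_type: str = ""
    series: str = ""
    start_date: str = ""
    end_date: str = ""

    def missing_menus(self):
        """Return the names of the menus that have not been set yet."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]


# Placeholder values; End Date is deliberately left unset.
selection = DownloaderSelection(
    country="United States",
    jurisdiction="National",
    unit="Aggregate",
    document_type="All Regulations",
    series="Total Words",
    start_date="2018",
)
print(selection.missing_menus())
```

An empty list from `missing_menus()` corresponds to the point at which every menu has an option selected.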

Each menu corresponds to a specific aspect of the data in our database that helps describe to the API what information you want to combine and download. Once an option is selected in each menu, click the download button to execute your command. This will query the database according to the parameters you selected, and download a .csv file with the specified information. Some commands will take much longer to pull results than others, generally because they require the API to assemble a large amount of data. If you are planning on pulling larger sets of data, please first check our bulk download page to see if the data you need is already available there. Alternatively, some familiarity with Python or R will generally allow you to assemble complex datasets much more quickly by using our specially built libraries (see the “Use the API” section on this page).

Limitations:

There are two general rules about how the interactive downloader works. First, the downloader can only pull data for one series at a time. Users who want data for multiple series will have to go through the process multiple times. The downloader does, however, support the option to pull data for multiple subnational jurisdictions and multiple industries at once. Second, if an option does not appear in the dropdown menu for a given jurisdiction, that means the type of data specified is not currently available for that jurisdiction. The easiest way to tell what kind of data is available for a given jurisdiction is to select that jurisdiction and then check the available Unit, Document Type, and Series options in their respective dropdown menus.
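Because each download covers a single series, users who need several series side by side will usually end up combining the resulting files themselves. The following is a minimal sketch of one way to do that with the Python standard library. The column names (`year`, `series_value`) and all of the numbers are made up for illustration; the files you download will have their own layout, so adjust the names accordingly.

```python
import csv
import io

# Made-up stand-ins for two separate downloads; real files will have
# different column names and values.
WORDS_CSV = "year,series_value\n2019,100\n2020,110\n"
RESTRICTIONS_CSV = "year,series_value\n2019,10\n2020,12\n"


def read_series(text, key="year", value="series_value"):
    """Map each key (here, a year) to the value reported for one series."""
    return {row[key]: row[value] for row in csv.DictReader(io.StringIO(text))}


def combine(series_by_name):
    """Join several single-series mappings into one row per key."""
    keys = sorted(set().union(*(s.keys() for s in series_by_name.values())))
    return [
        {"year": k, **{name: s.get(k) for name, s in series_by_name.items()}}
        for k in keys
    ]


combined = combine({
    "total_words": read_series(WORDS_CSV),
    "total_restrictions": read_series(RESTRICTIONS_CSV),
})
for row in combined:
    print(row)
```

To work with files on disk, replace `io.StringIO(text)` with an opened file. Keys that are missing from one series come through as `None` rather than being dropped.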

Overview of Each Menu:

The Country menu selects between the countries for which we have regulatory data. It currently includes the United States, Canada, and Australia. Data can only be pulled for one country at a time.

The Jurisdiction menu selects between the jurisdictions of the country selected in the Country menu. Select National in this menu to pull federal level regulatory data for the country in question. Select the name of a specific state or province (for example, Florida) to pull data for just that jurisdiction. Select All Subnational Jurisdictions to pull data for all available states or provinces in the country in question.

The Unit menu selects the level of text for which data is desired. Aggregate is the default level, and is what will probably be preferred by most users. As the name suggests, it aggregates data across the entire regulatory code in question. The Document level, on the other hand, provides data at the level of every unit of text for which data was gathered. The “document” is an important concept in text analysis; it is the distinct unit within a larger body of text that is compared and measured against other similar units. For the US Code of Federal Regulations (CFR), we used the “Part” level of regulation as our documents. For other regulatory codes, the document level corresponds to the level of regulation that most resembles a CFR part in length and specificity. If, for example, you are interested in word count data in the CFR, selecting the Aggregate option in the Unit menu will provide a sum of the total number of words in the entire CFR for the year(s) you specified. If you selected the Document option instead, you would receive a word count for each CFR part.
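The relationship between the two Unit options in the word count example can be sketched in a few lines of Python. This assumes, as that example suggests, that the Aggregate figure for a year is the sum of the Document-level figures for that year. The rows, part labels, and counts below are invented for illustration and do not come from our database.

```python
from collections import defaultdict

# Made-up document-level rows: (year, document identifier, word count).
document_rows = [
    (2020, "Part A", 1200),
    (2020, "Part B", 800),
    (2021, "Part A", 1300),
]


def aggregate_by_year(rows):
    """Sum document-level word counts into one total per year."""
    totals = defaultdict(int)
    for year, _document, words in rows:
        totals[year] += words
    return dict(totals)


print(aggregate_by_year(document_rows))
```

In practice, selecting Aggregate in the Unit menu saves you this step; the sketch is only meant to show what the two levels represent.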

The options in the Document Type menu will depend on the country and jurisdiction you selected. All Regulations is the standard option for all US jurisdictions. It provides data for all available documents in a regulatory code. Other options provide data for documents with a certain kind of regulation (or, in some cases, provide data on the likelihood that each document is composed of a certain kind of regulation). The Occupational Licensing option pulls occupational licensing data for those US states that have this data available (see Occupational Licensing RegData). The Healthcare option is available only at the US federal level and pulls healthcare data (see Health RegData). The US Electronic Code of Federal Regulations option is available only at the US federal level and pulls daily CFR data (see RegData US Daily). None of the aforementioned options are available for any Canadian or Australian jurisdiction. Instead, these jurisdictions each have the All Regulations and All Statutes options. This is because Canada and Australia both have parliamentary systems of government that do not promulgate regulation in the same way as the US system. Consequently, Australian and Canadian regulations can be found in two different types of documents: “Statutes” and “Regulations.” This difference in systems also explains why agency data is not available for either of these two countries (see RegData Canada and RegData Australia).

The Series menu allows the user to stipulate which kind of data they want for the jurisdiction and document type they have selected. The available options may vary by unit, document type, and jurisdiction. In general, Series options include: Total Words, Total Restrictions, options to get counts for each specific restriction phrase (e.g., Terms-May Not, Terms-Shall, etc.), Restrictions by Industry, Restrictions by Agency, Restrictions by Agency and Industry, and several complexity metrics (e.g., Flesch Reading Score, Sentence Length, etc.). Certain Document Types may have unique Series options. For example, occupational licensing document types have an Occupational Licensing Probabilities and an Occupational Licensing Restrictions option, while eCFR (RegData US Daily) documents have the Total Deregulatory Terms option. These options are best explained by the relevant user guides and documentation (see links above).

There are a few quirks with the Series menu. First, the Restrictions by Industry, Restrictions by Agency, and Restrictions by Agency and Industry options will each create sub-menus. These are used to select the agencies and industries you want data for. At least one option MUST be selected in these sub-menus for the database query to work correctly. Each sub-menu has an All Agencies or All Industries option at the top. Industries and/or agencies may also be selected individually, or in groups using the shift+click and shift+ctrl+click keyboard commands or their equivalent. Industry data should be available for most jurisdictions, but agency data is only available for the US federal level and certain states. Additionally, the only metric available by agency or industry is the restriction count metric. Complexity metrics and other word counts are NOT available by agency or industry.
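The sub-menu rules above can be summarized as a small check. This is a sketch of the logic as described in this tutorial, not the downloader's actual code: `check_series_selection` and the `SUBMENUS` mapping are hypothetical names, and the mapping of each series to its sub-menus is an assumption based on the series names.

```python
# Series names as listed in this tutorial; which sub-menus each one opens
# is an assumption made for this illustration.
SUBMENUS = {
    "Restrictions by Industry": ("industries",),
    "Restrictions by Agency": ("agencies",),
    "Restrictions by Agency and Industry": ("agencies", "industries"),
}


def check_series_selection(series, agencies=(), industries=()):
    """Return a list of problems with the selection (empty if none)."""
    chosen = {"agencies": list(agencies), "industries": list(industries)}
    required = SUBMENUS.get(series, ())
    problems = []
    # Every sub-menu opened by the series needs at least one selection.
    for menu in required:
        if not chosen[menu]:
            problems.append(f"select at least one option in the {menu} sub-menu")
    # Only the restriction count series can be broken out by agency/industry.
    for menu, values in chosen.items():
        if values and menu not in required:
            problems.append(f"{series} is not available by {menu}")
    return problems


print(check_series_selection("Restrictions by Industry"))
print(check_series_selection("Restrictions by Agency", agencies=["All Agencies"]))
print(check_series_selection("Total Words", industries=["All Industries"]))
```

The check does not cover jurisdiction-level availability (for example, agency data existing only for the US federal level and certain states); the dropdown menus themselves remain the best guide to that.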

Finally, the Start Date and End Date menus simply specify the range of years (or days) for which data is desired. Many jurisdictions, series, and document types are only available for single years. The easiest way to tell how many years of a certain type of data are available for a given jurisdiction is to fill out the rest of the menus and see which options appear in the Start Date and End Date menus.