What Is QuantGov?
Governments—federal, state, and local—produce enormous amounts of policy text, including the text of laws, regulations, trade agreements, treaties, court decisions, and even public speeches. For example, consider regulations. It would take the average person over three years to read the entire US Code of Federal Regulations (CFR), which contains more than 100 million words and counting. The sheer size and complexity of policy documents make it nearly impossible to tackle “big picture” issues, such as the cumulative effects of federal and state regulation.
Still, the details of policy-relevant documents matter for everyone: individuals, businesses, innovators, and public sector employees. To that end, Mercatus researchers Patrick McLaughlin and Oliver Sherouse created QuantGov, an open-source policy analytics platform designed to make the full breadth of government actions easier to understand and analyze. The platform allows researchers to examine bodies of text quickly and effectively using recent advances from data science, such as machine learning and other artificial intelligence techniques. It opens up lines of research that have rarely been explored, helping us gain a broad yet detailed picture of government policies and allowing us to scientifically test assumptions and theories about those policies’ causes and effects.
QuantGov can quickly search through thousands of pages of text for specific words or phrases. For example, QuantGov powered the datasets produced for the RegData Project, which used this feature to count the restrictive words in federal regulatory text. Similarly, if you wanted to compare how many restrictions were in effect in two different years, you could count the restrictive words in each year's text and compare the totals.
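The year-over-year comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not QuantGov's own code: the term list, the function name `count_restrictions`, and the two sample texts are all assumptions made up for the example.

```python
import re
from collections import Counter

# Illustrative list of restrictive terms (an assumption for this sketch).
RESTRICTION_TERMS = ("shall", "must", "may not", "required", "prohibited")

# One case-insensitive pattern; multiword terms tolerate any run of whitespace.
_PATTERN = re.compile(
    r"\b("
    + "|".join(
        r"\s+".join(re.escape(word) for word in term.split())
        for term in RESTRICTION_TERMS
    )
    + r")\b",
    re.IGNORECASE,
)


def count_restrictions(text):
    """Return a Counter mapping each restrictive term to its number of matches."""
    matches = _PATTERN.findall(text)
    # Normalize case and whitespace so "MAY  NOT" is counted as "may not".
    return Counter(" ".join(match.lower().split()) for match in matches)


# Hypothetical snapshots of the same regulation in two different years.
snapshots = {
    2019: "Operators shall file a report. Operators must keep records.",
    2020: (
        "Operators shall file a report. Operators must keep records. "
        "Operators may not discharge waste. A permit is required."
    ),
}

totals = {
    year: sum(count_restrictions(text).values())
    for year, text in snapshots.items()
}
change = totals[2020] - totals[2019]

print(totals)  # {2019: 2, 2020: 4}
print(change)  # 2
```

Word boundaries keep "must" from matching inside a word such as "mustard"; a real corpus would also need a decision on how to handle headings, tables, and quoted text.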
QuantGov allows users to apply the industry classification algorithm developed in the RegData Project to bodies of text other than the Code of Federal Regulations. In addition to identifying regulatory restrictions, the RegData Project’s classification algorithm tells us which industries are likely affected by each restriction. If you wanted to know how relevant a policy document is to a specific industry, from banking to cheese production, you could use that algorithm to find out. For example, you could run it over court decisions to see which industries might be affected. You could even adapt the classification algorithm to subjects other than industry relevance.
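To make the idea of scoring a document's relevance to an industry concrete, here is a small sketch using a multinomial Naive Bayes classifier written with the standard library only. It is a stand-in chosen for brevity and is not the RegData Project's algorithm; the class name, the training sentences, and the industry labels are hypothetical.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesIndustryClassifier:
    """Multinomial Naive Bayes with add-one smoothing (illustrative stand-in)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # industry -> term counts
        self.doc_counts = Counter()              # industry -> training docs
        self.vocab = set()

    def fit(self, labeled_docs):
        for text, industry in labeled_docs:
            tokens = tokenize(text)
            self.word_counts[industry].update(tokens)
            self.doc_counts[industry] += 1
            self.vocab.update(tokens)
        return self

    def predict_proba(self, text):
        """Return {industry: probability} for the given text."""
        # Words never seen in training carry no information, so skip them.
        tokens = [t for t in tokenize(text) if t in self.vocab]
        total_docs = sum(self.doc_counts.values())
        log_scores = {}
        for industry, n_docs in self.doc_counts.items():
            counts = self.word_counts[industry]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)
            for token in tokens:
                score += math.log((counts[token] + 1) / denom)
            log_scores[industry] = score
        # Convert log scores to probabilities in a numerically stable way.
        top = max(log_scores.values())
        exps = {k: math.exp(v - top) for k, v in log_scores.items()}
        norm = sum(exps.values())
        return {k: v / norm for k, v in exps.items()}


# Hypothetical labeled examples; a real model needs far more training text.
TRAINING = [
    ("banks must hold capital reserves against loans and deposits", "banking"),
    ("lenders shall disclose interest rates on consumer loans", "banking"),
    ("cheese made from raw milk shall be aged sixty days", "cheese manufacturing"),
    ("dairy plants must pasteurize milk used in cheese", "cheese manufacturing"),
]

model = NaiveBayesIndustryClassifier().fit(TRAINING)
probs = model.predict_proba(
    "The court held that lenders failed to disclose interest rates on loans."
)
best = max(probs, key=probs.get)
print(best)
```

Because the model returns a probability per industry rather than a single label, the same output can rank documents by relevance to one industry. Swapping the labels for another subject only requires different training examples.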
Topic Modeling & Clustering
You can point QuantGov at a body of text, and the algorithm will cluster the documents on its own. It can identify topics that recur across the documents, and it goes beyond matching one specific term by also picking up related terms. For example, you can use QuantGov to analyze how often a policymaker discusses a broad issue.
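As a rough illustration of unsupervised grouping, the sketch below weights terms with TF-IDF, links documents whose cosine similarity clears a threshold, and reports the terms each group shares. It is a simplified stand-in for topic modeling, not QuantGov's implementation; the function names, the threshold value, and the sample statements are assumptions.

```python
import math
import re
from collections import Counter


def tokenize(text):
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document."""
    token_lists = [tokenize(doc) for doc in docs]
    n_docs = len(docs)
    doc_freq = Counter(t for tokens in token_lists for t in set(tokens))
    vectors = []
    for tokens in token_lists:
        term_freq = Counter(tokens)
        # Terms found in every document get zero weight, so drop them.
        vectors.append({
            term: count * math.log(n_docs / doc_freq[term])
            for term, count in term_freq.items()
            if doc_freq[term] < n_docs
        })
    return vectors


def cosine(a, b):
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm = math.sqrt(sum(w * w for w in a.values())) * math.sqrt(
        sum(w * w for w in b.values())
    )
    return dot / norm if norm else 0.0


def cluster(docs, threshold=0.05):
    """Group documents linked by similarity >= threshold (single linkage)."""
    vectors = tfidf_vectors(docs)
    parent = list(range(len(docs)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(len(docs)):
        for j in range(i + 1, len(docs)):
            if cosine(vectors[i], vectors[j]) >= threshold:
                parent[find(j)] = find(i)

    groups = {}
    for i in range(len(docs)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values()), vectors


def shared_terms(doc_ids, vectors):
    """Terms that appear in every document of a cluster."""
    common = set(vectors[doc_ids[0]])
    for i in doc_ids[1:]:
        common &= set(vectors[i])
    return sorted(common)


# Hypothetical statements by a policymaker.
statements = [
    "we must cut taxes and reduce the tax burden on families",
    "lower taxes and tax relief help families save",
    "our schools need more teachers and better classrooms",
    "teachers deserve support and schools deserve funding",
]

clusters, vectors = cluster(statements)
topics = [shared_terms(ids, vectors) for ids in clusters]
print(clusters)  # [[0, 1], [2, 3]]
print(topics)    # [['families', 'tax', 'taxes'], ['schools', 'teachers']]
```

The size of each cluster gives a simple count of how many statements touch a broad issue. The threshold is a tuning choice: very short texts like these share few words, so it is set low here.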