Why It Matters
A new Congressional Research Service report examines the rocky history of Data.gov implementation and raises questions about whether the federal government's flagship open data portal is living up to its statutory mandate. The report arrives at a moment when the Trump Administration has been pulling datasets from public view, and a key Office of Management and Budget (OMB) oversight report remains years overdue.
Data.gov, the government's primary public data portal, is legally required under the OPEN Government Data Act (Title II of the Foundations for Evidence-Based Policymaking Act of 2018, P.L. 115-435) to serve as a "single public interface online as a point of entry" for federal data. But the CRS report (published May 21, 2026) finds that agencies have broad discretion to decide what data to include, that the catalog's accuracy is unreliable, and that OMB has failed to file a congressionally required biennial compliance report, leaving Congress and the public without a clear accounting of whether agencies are meeting their obligations.
At the same time, media reports throughout 2025 documented widespread removal of federal datasets from agency websites, and the National Press Club issued a statement in March 2026 on "the Elimination of Data from Federal Agency Websites." The report notes that "some observers are also tracking the removal of specific datasets, variables, and tools."
The Big Picture
Data.gov was launched in May 2009 by the Obama Administration as part of its open government initiative, described at the time as "a one-stop website for finding accessible and free government information in open formats." The General Services Administration administers the site's day-to-day operations, while the Office of Management and Budget exercises control over implementation and issues guidance to agencies.
The site functions primarily as a directory, not a warehouse. It harvests metadata (including titles, descriptions, keywords, and links) from agency websites and aggregates them into a federal data catalog. The underlying data assets themselves typically remain hosted on individual agency servers. As CRS notes, this creates a fundamental reliability problem: the catalog "may indicate that a dataset is available, but the link provided to the agency's website could show an error message either for a technical problem or because an agency has removed it from their site."
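The directory-versus-warehouse distinction can be illustrated with a minimal Python sketch. It assumes a simplified catalog record (the CatalogEntry fields are hypothetical, not the actual Data.gov metadata schema) and a stand-in fetch function in place of real HTTP requests. It shows how an audit might flag entries whose catalog listing survives while the agency-hosted link no longer resolves.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass
class CatalogEntry:
    """Hypothetical, simplified metadata record; not the real Data.gov schema."""
    title: str
    agency: str
    url: str  # points at the agency's own server, not at the catalog


def audit_links(
    entries: Iterable[CatalogEntry],
    fetch_status: Callable[[str], int],
) -> Dict[str, List[str]]:
    """Group entry titles by whether their agency-hosted link still resolves."""
    report: Dict[str, List[str]] = {"ok": [], "broken": [], "unreachable": []}
    for entry in entries:
        try:
            status = fetch_status(entry.url)
        except OSError:
            # Network-level failure: the agency host could not be reached.
            report["unreachable"].append(entry.title)
            continue
        # 2xx and 3xx count as available; anything else (e.g. 404) is broken.
        bucket = "ok" if 200 <= status < 400 else "broken"
        report[bucket].append(entry.title)
    return report


# Illustrative stand-in for an HTTP client; the URLs and statuses are invented.
def fake_fetch(url: str) -> int:
    statuses = {
        "https://agency-a.example/data/air-quality.csv": 200,
        "https://agency-b.example/data/removed.csv": 404,
    }
    if url not in statuses:
        raise ConnectionError(f"cannot reach {url}")
    return statuses[url]


sample = [
    CatalogEntry("Air quality readings", "Agency A",
                 "https://agency-a.example/data/air-quality.csv"),
    CatalogEntry("Removed survey", "Agency B",
                 "https://agency-b.example/data/removed.csv"),
    CatalogEntry("Offline archive", "Agency C",
                 "https://agency-c.example/data/archive.csv"),
]

report = audit_links(sample, fake_fetch)
print(report)
```

In this sketch all three entries would still appear in the catalog, yet only the first leads to usable data. A real audit would replace fake_fetch with actual HTTP requests and would need to handle redirects, timeouts, and pages that return a success status while displaying an error message.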
Congress codified Data.gov's functions into law in January 2019, requiring agencies to build comprehensive data inventories and to submit public data assets for inclusion in the federal catalog. The law also required OMB to issue implementation guidance. That guidance, Memorandum M-25-05, did not arrive until January 15, 2025, six years after enactment and just days before the Biden Administration left office.
The Government Accountability Office flagged the missing guidance in an October 2020 report, warning that "without OMB's report on agency performance and compliance with the OPEN Government Data Act, Congress and the public lack key information about the extent to which agencies are meeting their requirements." Senator Charles Grassley sent a letter to OMB in September 2023 pressing for action. GAO reported that as of March 2025, it had received no further updates on OMB compliance with the statutory reporting requirement.
Definitional Disputes
One of the more technical but consequential findings involves how "data asset" is defined. The statute defines it broadly as "a collection of data elements or data sets that may be grouped together." But M-25-05 interpreted that definition more narrowly, requiring data to be both structured, such as information organized in columns and rows, and logically grouped. That narrower reading could exclude unstructured information, even if it would otherwise be releasable under a Freedom of Information Act request.
The CRS report notes the resulting ambiguity "arguably prevents clarity in advance regarding what data might be included in the agency CDI," referring to the comprehensive data inventories (CDI) agencies are required to maintain. Agencies may also "exercise reasonable discretion in determining whether any particular collection of data is sufficiently structured to be considered a data asset," according to M-25-05, language that effectively gives agencies a wide lane to exclude information.
The report is direct about the limits of Data.gov as an authoritative source. Because the catalog reflects metadata harvested from dozens of agency systems, it is vulnerable to errors at multiple stages: agencies may fail to document assets with proper metadata, links may break when agencies reorganize their websites, and data may be removed from agency servers without any corresponding update to the catalog.
This creates a core policy question the report poses for Congress: Should Data.gov remain a registry that points users elsewhere, or should it become a repository that actually hosts the underlying data? The current statutory framework permits either approach, but in practice the site functions as a registry, with all the fragility that entails.
Political Stakes
For the Administration
The Trump Administration's record on public data access is directly implicated. The report notes that OMB Circular A-130, a foundational document governing federal information resources management, "is not currently posted on the OMB website of the second Trump Administration." The circular provides the policy framework agencies use to manage information. Its absence from the OMB website raises questions about whether the current administration considers it operative.
More broadly, the administration's workforce reductions and agency reorganizations create practical risks for Data.gov implementation. Agencies are required to complete various data asset assessments by September 30, 2026, a deadline that may be difficult to meet given reduced capacity across the federal government.
For Congress
The report is, at its core, a congressional oversight document. It identifies multiple pressure points where Congress could act: requiring OMB to produce the overdue compliance reports, clarifying whether Data.gov should function as a registry or repository, setting standards for metadata quality, and legislating persistence requirements so that publicly available datasets cannot be quietly removed without notice.
A bill introduced in 2017 (the Preserving Data in Government Act) would have required agencies to provide advance public notice before removing publicly available data assets. It never received a floor vote. The current environment may give that idea renewed relevance.
For the Public
The stakes for ordinary users, researchers, journalists, and state and local governments are concrete. Federal data underpins public health research, economic analysis, environmental monitoring, and countless other functions. When datasets disappear or become unreliable, the downstream effects ripple broadly. The report notes that organizations, including the Association of Health Care Journalists and the Data Rescue Project, have begun independently archiving federal data out of concern that government sources can no longer be counted on for persistence.
The Bottom Line
Congress passed a law requiring open government data, and the executive branch has been slow to implement it and may now be actively undermining it. The CRS report gives lawmakers a detailed roadmap of where the gaps are, from definitional ambiguities that let agencies exclude data, to a compliance reporting requirement that has gone unmet for years, to a catalog that cannot reliably tell users whether federal data actually exists.
