As we await the arrival of a new wave of filings in June 2011, we thought we’d check out the library of variables that will be available. In addition to US-GAAP data elements, filers are allowed to add “extensions” to their submissions. These extensions are defined independently by each company and, for now, require no external coordination or rationalization. Extensions are added to a filing via the xsd file, an XBRL definitions file that starts with
We instructed our computer to count the 10-K, 10-Q and 20-F filings for the past year and rummage through their xsd files. On this particular test run, 36,136 SEC filings in the Accessions tape met our criteria. Of these, 3,880 included XBRL exhibit attachments, representing 1,488 SEC registrants as identified by Central Index Key (CIK). That is roughly one fifth of the population expected to start submitting in June 2011.
Of the 3,880 filings, 3,039 (78%) contained extension elements. The remainder used only the primary taxonomies. The number of extensions per filing ranged from a low of one to a high of 922. In total, the sampled filings created 183,846 extension elements.
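A minimal sketch of how a count like this could be made, assuming that an extension element shows up as a top-level `xs:element` declaration in the company's own xsd. The sample schema, its element names, and the helper functions `count_extension_elements` and `summarize` are all hypothetical and exist only for illustration.

```python
import xml.etree.ElementTree as ET

# Standard XML Schema namespace, in ElementTree's {uri}tag notation.
XS = "{http://www.w3.org/2001/XMLSchema}"

def count_extension_elements(xsd_text: str) -> int:
    """Count top-level xs:element declarations in one company xsd.

    Assumption: each such declaration is one company-defined extension.
    """
    root = ET.fromstring(xsd_text)
    return sum(1 for child in root if child.tag == XS + "element")

def summarize(counts):
    """Summarize per-filing extension counts (one integer per filing)."""
    with_ext = [c for c in counts if c > 0]
    return {
        "filings": len(counts),
        "with_extensions": len(with_ext),
        "share": len(with_ext) / len(counts) if counts else 0.0,
        "min": min(with_ext, default=0),
        "max": max(with_ext, default=0),
        "total": sum(with_ext),
    }

# Hypothetical company schema with two made-up extension elements.
SAMPLE_XSD = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
    targetNamespace="http://example.com/20110331">
  <xs:import namespace="http://example.org/base" schemaLocation="base.xsd"/>
  <xs:element name="HypotheticalStoreClosureCosts" id="ex_A" type="xs:decimal"/>
  <xs:element name="HypotheticalLoyaltyLiability" id="ex_B" type="xs:decimal"/>
</xs:schema>"""

if __name__ == "__main__":
    print(count_extension_elements(SAMPLE_XSD))
    print(summarize([0, 2, 5, 0]))
```

Running `summarize` over one count per filing gives the same kind of figures quoted above: the share of filings with extensions, the low and high per filing, and the grand total.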
Extrapolating this to an estimated 8,100 companies submitting beginning in June 2011 (8,100/1,488 = 5.44X) suggests an annual extension library of just over one million independently created elements to keep track of, in addition to the US-GAAP set. Wowie!
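The arithmetic behind that estimate can be checked in a few lines. This is only a back-of-envelope sketch: it assumes the new filers will create extensions at the same per-registrant rate as the sampled ones, and the variable names are ours for illustration.

```python
# Figures from the test run described above.
filings_sampled = 3_880
filings_with_extensions = 3_039
registrants_sampled = 1_488
extensions_sampled = 183_846

# Estimated number of companies submitting from June 2011.
registrants_expected = 8_100

# Share of sampled filings that carry at least one extension (about 78%).
share_with_extensions = filings_with_extensions / filings_sampled

# Scale-up factor from the sample to the expected population (about 5.44X).
scale = registrants_expected / registrants_sampled

# Assumption: extensions grow in proportion to the number of registrants.
projected_extensions = extensions_sampled * scale

print(f"share with extensions: {share_with_extensions:.0%}")
print(f"scale factor: {scale:.2f}X")
print(f"projected extensions: {projected_extensions:,.0f}")
```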
This actually doesn’t bother us that much. We have long expected that extensions, the workaround for taxonomy architects not being able to anticipate every possible data element, would create this type of algal bloom in the data. It merely points out two things.
First, unless one is doing merger and acquisition work, one can probably ignore most of these extensions and do most first- and second-level screening analytics using just the US-GAAP subset. This replicates, if not surpasses, the detail coming out of the best of the fundamental feeds. Besides, if you are doing M&A diligence, you are looking at more than just financial reporting filings anyway.
Analysts have long held that the first true use of XBRL filing exhibits would be subset analytics, as opposed to extreme diligence tracing of every nuance item in a filing. From what we see, the June 2011 filing population should provide the first near-census opportunity to do aggregate and sector analysis on U.S. public companies where the data goes directly from the SEC’s EDGAR system to the research department without passing through an intermediary processor. And the chain of process is traceable to the government evidentiary source. We like that!
Second, from our cursory inspection, we believe many of these company-created extensions are common in nature. With proper effort, they can be aligned into new standardized taxonomy elements over time. As these winnow down, we expect what remains will be the specifics that are truly unique to a company. Still, seeing a million data elements to catalog is once again a good lesson to all that, for information to be usable in data management, less is more.