Release for data update 03/31/2020
Bulk Download Changes
- Line Breaks retained in text data:
- Claims: all text from 2001 and later will have the line breaks in the text
- Brief Summary Text:
* Data from 2020 and later will have the line breaks retained in the text.
* Line breaks for older data will be included at the first opportunity to reparse older data.
- Detailed Description Text:
* Data from 2020 and later will have the line breaks retained in the text.
* Line breaks for older data will be included at the first opportunity to reparse older data.
- Draw Description Text: Line breaks are not included at this time.
- Location ID added to patent_assignee and patent_inventor
- Previously, identifying the location of a patent by way of its assignee required joining patent_assignee with location_assignee and then with the location table. A similar join was needed for the patent inventor. To reduce this complexity, the patent_assignee and patent_inventor tables will carry an additional field: location_id. This field maps to the id field of the location table. This makes the data in location_assignee and location_inventor redundant, so future releases will not carry these two tables.
- Read In Scripts:
- Example Python & R scripts that demonstrate reading each bulk download file will be available here: Read In Scripts. This is a work in progress and will be updated over time.
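The location_id change described above can be sketched as a single lookup in place of the earlier two-step join. This is a minimal illustration using only the Python standard library, with small in-memory rows standing in for the real files. Only location_id (in patent_assignee) and id (in location) come from these notes; the other column names (patent_id, assignee_id, city, state) and all sample values are assumptions made for the example.

```python
# Hypothetical sample rows. Only patent_assignee.location_id and location.id
# are named in the release notes; every other column and value is an assumption.
patent_assignee = [
    {"patent_id": "P1", "assignee_id": "A1", "location_id": "loc1"},
    {"patent_id": "P2", "assignee_id": "A2", "location_id": "loc2"},
    {"patent_id": "P3", "assignee_id": "A1", "location_id": ""},  # no location recorded
]
location = [
    {"id": "loc1", "city": "Austin", "state": "TX"},
    {"id": "loc2", "city": "Boston", "state": "MA"},
]


def attach_location(rows, locations):
    """Left-join rows to locations on rows.location_id == locations.id."""
    by_id = {loc["id"]: loc for loc in locations}
    joined = []
    for row in rows:
        loc = by_id.get(row["location_id"])  # None when there is no match
        joined.append({
            **row,
            "city": loc["city"] if loc else None,
            "state": loc["state"] if loc else None,
        })
    return joined


patent_locations = attach_location(patent_assignee, location)
for rec in patent_locations:
    print(rec["patent_id"], rec["city"], rec["state"])
```

The same function would apply to patent_inventor rows, since that table carries the same location_id field.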
- Planned changes after 2020.03.22v1 release (Documentation and details will be added with the release)
- Claims:
- Remove duplicates in some of the claims yearly files where the first set of records (about 300K) are duplicated.
- Remove NULL text data in some of the claims files.
- Recode NUM field and add documentation.
- Recode Exemplary field (replacing TRUE/FALSE with 0/1)
- Re-order header to be consistent with data dictionary
- Brief Summary Text:
- Break files into yearly files
- Draw Description Text:
- Break files into yearly files
- Include line breaks in the text
Table | File(s) | Data Contains Line Break | Field Separator | Quote Settings | Quote Character |
---|---|---|---|---|---|
claims | Yearly files from 1976 - 2005 | No | \t | Non Numeric Fields Quoted | " |
claims | Yearly files from 2005 - 2020 | Yes | \t | Non Numeric Fields Quoted | " |
brf_sum_text | Single bulk file | Yes | \t | Non Numeric Fields Quoted | " |
detail_desc_text | 2020 data file | Yes | \t | Non Numeric Fields Quoted | " |
detail_desc_text | 2019 data file | No | \t | Non Numeric Fields Quoted | " |
detail_desc_text | Yearly files from 1976 - 2018 | No | \t | Unquoted | N/A |
draw_desc_text | Single bulk file | No | \t | Non Numeric Fields Quoted | " |
all other tables | Single bulk file | No | \t | Non Numeric Fields Quoted | " |
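The settings in the table above could be applied when reading a file roughly as follows. This is a sketch under stated assumptions, not the project's own read-in script: it uses Python's standard csv module, the helper name read_rows is hypothetical, and the sample column names and values are invented for illustration. Quoted files are read with the default quote handling so that a quoted field may span several lines; unquoted files are read with csv.QUOTE_NONE so that any double quote is treated as ordinary text.

```python
import csv
import io


def read_rows(handle, quoted=True):
    """Yield each record of a tab-separated file as a dict keyed by the header row.

    quoted=True  -> fields may be wrapped in double quotes and may contain line breaks
    quoted=False -> the file has no quote character, so quotes are kept as plain text
    """
    if quoted:
        reader = csv.DictReader(handle, delimiter="\t", quotechar='"')
    else:
        reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    yield from reader


# Hypothetical sample in the quoted layout; column names are assumptions.
quoted_sample = (
    '"patent_id"\t"text"\t"sequence"\n'
    '"P1"\t"1. A widget comprising:\na frame; and\na handle."\t1\n'
    '"P2"\t"2. The widget of claim 1."\t2\n'
)
# Hypothetical sample in the unquoted layout.
unquoted_sample = 'patent_id\ttext\nP1\tA so-called "widget" is described.\n'

quoted_rows = list(read_rows(io.StringIO(quoted_sample, newline="")))
unquoted_rows = list(read_rows(io.StringIO(unquoted_sample, newline=""), quoted=False))

print(len(quoted_rows))                      # number of records, not physical lines
print(quoted_rows[0]["text"].count("\n"))    # line breaks kept inside the first text field
print(unquoted_rows[0]["text"])
```

For a file on disk, the handle would typically come from open(path, newline="", encoding="utf-8"), where the encoding is an assumption. Passing newline="" matters because it lets the csv module, rather than the file object, decide where a record ends. Very long text fields may exceed the csv module's default field size limit, in which case csv.field_size_limit can be used to raise it.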