Release for data update 03/31/2020

released this 10 Jun 15:14

Bulk Download Changes

  • Line breaks retained in text data:
    • Claims: All text from 2001 and later retains its line breaks.
    • Brief Summary Text:
      • Data from 2020 and later retains its line breaks.
      • Line breaks for older data will be added at the first opportunity to reparse that data.
    • Detailed Description Text:
      • Data from 2020 and later retains its line breaks.
      • Line breaks for older data will be added at the first opportunity to reparse that data.
    • Draw Description Text: Line breaks are not included at this time.
  • Location ID added to patent_assignee and patent_inventor
    • Previously, identifying the location of a patent by way of its assignee required joining patent_assignee with location_assignee and then with the location table. A similar join was needed for the patent inventor. To reduce this complexity, the patent_assignee and patent_inventor tables now carry an additional field, location_id, which maps to the id field of the location table. This makes the data in location_assignee and location_inventor redundant, and future releases will not include these two tables.
  • Read In Scripts:
    • Example Python and R scripts that demonstrate reading each bulk download file will be available here: Read In Scripts. This is a work in progress and will be updated over time.
  • Planned changes after 2020.03.22v1 release (Documentation and details will be added with the release)
    • Claims:
      • Remove duplicates in some of the yearly claims files, where the first set of records (about 300K) is duplicated.
      • Remove NULL text data in some of the claims files.
      • Recode the NUM field and add documentation.
      • Recode the Exemplary field (replacing TRUE/FALSE with 0/1).
      • Re-order the header to be consistent with the data dictionary.
    • Brief Summary Text:
      • Break files into yearly files
    • Draw Description Text:
      • Break files into yearly files
      • Include line breaks in the text
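
The location_id change described above means a single lookup against the location table replaces the old two-step join. A minimal sketch using only the Python standard library follows. Only location_id (on patent_assignee) and id (on location) come from these notes; the other column names, the sample rows, and the attach_location helper are hypothetical and for illustration only.

```python
# Hypothetical sample rows; the real tables carry more columns than shown.
location = [
    {"id": "loc1", "city": "Austin", "country": "US"},
    {"id": "loc2", "city": "Osaka", "country": "JP"},
]
patent_assignee = [
    {"patent_id": "10000001", "assignee_id": "a1", "location_id": "loc1"},
    {"patent_id": "10000002", "assignee_id": "a2", "location_id": "loc2"},
    {"patent_id": "10000003", "assignee_id": "a3", "location_id": None},
]


def attach_location(rows, locations):
    """Left-join rows to locations on rows.location_id == locations.id."""
    by_id = {loc["id"]: loc for loc in locations}
    joined = []
    for row in rows:
        # loc is None when the row has no location or the id is unknown.
        loc = by_id.get(row["location_id"])
        joined.append({
            **row,
            "city": loc["city"] if loc else None,
            "country": loc["country"] if loc else None,
        })
    return joined


joined = attach_location(patent_assignee, location)
for row in joined:
    print(row["patent_id"], row["city"], row["country"])
```

The same pattern should apply to patent_inventor, since it carries the same location_id field.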
| Table            | File(s)                       | Data Contains Line Break | Field Separator | Quote Settings            | Quote Character |
|------------------|-------------------------------|--------------------------|-----------------|---------------------------|-----------------|
| claims           | Yearly files from 1976 - 2005 | No                       | \t              | Non Numeric Fields Quoted | "               |
| claims           | Yearly files from 2005 - 2020 | Yes                      | \t              | Non Numeric Fields Quoted | "               |
| brf_sum_text     | Single bulk file              | Yes                      | \t              | Non Numeric Fields Quoted | "               |
| detail_desc_text | 2020 data file                | Yes                      | \t              | Non Numeric Fields Quoted | "               |
| detail_desc_text | 2019 data file                | No                       | \t              | Non Numeric Fields Quoted | "               |
| detail_desc_text | Yearly files from 1976 - 2018 | No                       | \t              | Unquoted                  | N/A             |
| draw_desc_text   | Single bulk file              | No                       | \t              | Non Numeric Fields Quoted | "               |
| all other tables | Single bulk file              | No                       | \t              | Non Numeric Fields Quoted | "               |
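
The file settings listed above map fairly directly onto Python's standard csv module. The sketch below is one possible way to read such files and is not the official read-in script. The read_bulk_rows helper, the column names, and the in-memory sample are hypothetical, and the sketch assumes the header row is quoted like other non-numeric fields.

```python
import csv
import io


def read_bulk_rows(handle, quoted=True):
    """Yield rows from a tab-separated bulk file as dicts keyed by the header."""
    if quoted:
        # Non-numeric fields are quoted, so quoted text may span several
        # lines. Unquoted fields are converted to float by the reader.
        reader = csv.reader(handle, delimiter="\t", quotechar='"',
                            quoting=csv.QUOTE_NONNUMERIC)
    else:
        # Unquoted files: treat any quote characters as literal text.
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    header = next(reader)
    for row in reader:
        yield dict(zip(header, row))


# Hypothetical sample standing in for a quoted file with a retained line break.
sample = (
    '"patent_id"\t"text"\t"sequence"\n'
    '"10000000"\t"First line.\nSecond line."\t1\n'
)
rows = list(read_bulk_rows(io.StringIO(sample)))

# Hypothetical sample standing in for an unquoted file.
plain = 'patent_id\ttext\n10000000\tA "quoted" word\n'
plain_rows = list(read_bulk_rows(io.StringIO(plain), quoted=False))
```

When reading from disk, opening the file with newline="" (for example, open(path, newline="", encoding="utf-8")) lets the csv module handle line breaks inside quoted fields itself.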