- The problem of “over-declaration”
- The problem of “submarine SEPs” and “blanket declaration”
- IPlytics data and approach to solving the “over-declaration”, the “blanket-declaration” and the “submarine SEP” problems
- Measuring the likelihood of a patent’s essentiality
I. The problem of “over-declaration”
Standard setting organizations (SSOs) such as ETSI, IEEE, ITU-T, ISO, IEC and others maintain databases of declared standard essential patents (SEPs). Patent owners disclose and list patents that are, in their own estimation, potentially essential for a given standard; the databases therefore contain self-declared patents. The patent owner does not submit any evidence, such as claim charts, that would prove essentiality, nor does the SSO or any third party conduct essentiality tests that are made public in any declaration database. SSOs often encourage timely declaration even of pending applications, whose claims may still change before grant. The purpose of patent declaration databases is not to provide an accurate list of verified SEPs but to ensure that any potentially essential patent is covered by the FRAND (Fair, Reasonable and Non-Discriminatory) commitment. Due to the nature and purpose of declaration databases, only a small share of the listed patents is essential to the standard. Studies estimate that only about 10-30% of all declared and granted patents are essential. Portfolio analyses of declared patents therefore always include patents that cannot be mapped to the standard. This is especially problematic because the essentiality rate of declared patents differs across portfolios, as some companies evaluate the patents they declare more thoroughly than others. The so-called “over-declaration” problem makes it difficult to understand the actual number of verified SEPs a company owns for a given standard.
II. The problem of “submarine SEPs” and “blanket declaration”
Some SSOs, such as IEEE (Wi-Fi generations) or ITU-T (AVC, HEVC, VVC), allow the submission of so-called “blanket declarations”. A blanket declaration is a statement by a company that it owns patents potentially essential for a given standard, without listing specific patent numbers. Typically, the declaring company commits to license any patent in its portfolio that is essential to the standard under FRAND. Legally, this commitment is more far-reaching than a specific declaration: any SEP the company owns will be licensed under FRAND, regardless of whether individual patent numbers were declared. However, a single blanket declaration may cover a few, hundreds or thousands of SEPs, and there is no transparency about the actual portfolio size. This “blanket-declaration” problem makes it difficult to understand the actual number of verified SEPs a company owns for a given standard. In addition, declaring specific patents is a challenge for the patent owner, as crawling through a large portfolio to identify all potential SEPs is time-consuming. Some companies therefore do not declare all their patents, and these undisclosed patents are so-called “submarine SEPs”.
III. IPlytics data and approach to solving the “over-declaration”, the “blanket-declaration” and the “submarine SEP” problems
IPlytics gathers data from standard setting organizations, such as meeting minutes, working group participation, standards contributions and standards meeting attendance, as well as data from worldwide declarations of SEPs. IPlytics maintains and indexes a database with over 120 million worldwide filed patents, 4 million worldwide ratified standards documents, 1.5 million standards contributions and meeting minutes, and over 300,000 declared standard essential patents (see figure 1). IPlytics compares inventor names, applicant/assignee names and CPC/IPC classifications as well as patent and non-patent citations with the full-text standards and contribution data. Furthermore, standards sections are semantically compared to patent claims. All patent documents in the IPlytics database are full-text indexed (title, abstract, claims and description) and, if not originally in English, machine translated to English.
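To illustrate the kind of metadata comparison described above, the following is a minimal, hypothetical sketch: it checks a patent record against a standards contribution record for overlapping applicant/company names, inventor/author names and cited specifications. All class names, field names and the scoring logic are assumptions for illustration, not IPlytics' actual data model (classification matching is omitted in this toy example).

```python
# Illustrative sketch only: toy overlap check between a patent record and a
# standards contribution record. Field names are hypothetical; a production
# pipeline would normalize names and weight the individual signals.
from dataclasses import dataclass, field
from typing import Set


@dataclass
class PatentRecord:
    applicant: str
    inventors: Set[str] = field(default_factory=set)
    npl_citations: Set[str] = field(default_factory=set)  # cited standards documents


@dataclass
class ContributionRecord:
    company: str
    authors: Set[str] = field(default_factory=set)
    related_specs: Set[str] = field(default_factory=set)  # e.g. {"TS 38.331"} (made up)


def overlap_signals(patent: PatentRecord, contrib: ContributionRecord) -> dict:
    """Count simple metadata overlaps between a patent and a standards contribution."""
    return {
        "same_company": patent.applicant.lower() == contrib.company.lower(),
        "shared_people": len(patent.inventors & contrib.authors),
        "cited_specs": len(patent.npl_citations & contrib.related_specs),
    }
```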
IV. Measuring the likelihood of a patent’s essentiality
1. Linguistic standard section to patent claim analysis
IPlytics works with technical experts who have many years of experience in the relevant technology space. These experts are either members of the 3GPP/IEEE/HEVC/VVC working groups, patent attorneys (US or EP) or technical consultants with an engineering background in the specific domain. The selected experts conduct a linguistic analysis of the standard section and compare its technical description with selected patent claims on a given sample of claim charts (figure 2). A training dataset of these manually created claim charts serves as input for the semantic LSI algorithm: independent claims are semantically compared to standards sections.
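As a hedged illustration of what such a training dataset might look like, the record structure below pairs an independent claim with a TS section and the expert's essentiality judgement. The type name, field names and example identifiers are hypothetical, not IPlytics' actual schema.

```python
# Hypothetical shape of the expert-labelled training data described above.
from typing import List, NamedTuple


class ClaimChartExample(NamedTuple):
    patent_number: str     # granted patent or application, e.g. "EP1234567B1" (made up)
    claim_text: str        # full text of an independent claim
    ts_section_id: str     # e.g. "TS 36.321, section 5.4" (made up)
    ts_section_text: str   # full text of the referenced TS section
    essential: bool        # expert judgement taken from the manual claim chart


# The labelled pairs feed the training of the semantic LSI model described below.
training_set: List[ClaimChartExample] = []
```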
2. Latent Semantic Indexing
IPlytics makes use of semantic similarity comparisons between patent claims and standard TS sections. TS documents are often split into hundreds of sections to ensure the algorithm compares the claim language to individual TS sections only. The semantic LSI model is trained with the linguistic input of experts, because the language of patent attorneys drafting claims differs from that of 3GPP engineers writing specifications. A training data set of actual claim charts provides linguistic examples of essential and non-essential combinations of patent claims and technical specification sections. The model used is LSI, applied in a 5-step approach (a minimal code sketch of these steps follows the list):
1. We use the individual sections of each standard and worldwide patent claims as the textual input of the similarity analysis.
2. We create a word vector matrix (term-document matrix), where each row corresponds to a term (of the documents of interest) and each column corresponds to a document. Each element (m, n) in the matrix is the frequency with which term m occurs in document n. We apply log-entropy weighting for local and global term weighting.
3. Singular value decomposition (SVD) is used to reduce this matrix to a product of three matrices, one of which has non-zero values (the singular values) only on the diagonal.
4. Dimensionality is reduced by deleting all but the k largest values on this diagonal, together with the corresponding columns in the other two matrices. This truncation generates a k-dimensional vector space in which both terms and documents are represented by k-dimensional vectors.
5. The relatedness of any two objects represented in the space is reflected by the proximity of their representation vectors, in our case measured by the cosine. The LSI model produces a semantic similarity score for each combination of patent claim and technical specification section; scores are presented as similarity percentages. If a declaration does not specify a TS version, the latest TS version is used. For each patent-standard combination, the algorithm identifies the claim-section combination with the highest semantic overlap.
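The following is a minimal, self-contained sketch of the five steps using plain NumPy on toy text. The toy TS sections, the toy claim and all function names are illustrative assumptions; the sketch shows the mechanics of term-document construction, log-entropy weighting, truncated SVD and cosine scoring, not the production pipeline.

```python
# Minimal sketch of the 5-step LSI comparison described above (illustrative only).
import numpy as np
from collections import Counter


def term_document_matrix(tokenized_docs):
    """Step 2: rows are terms, columns are documents, cells are raw term frequencies."""
    vocab = sorted({t for doc in tokenized_docs for t in doc})
    row = {t: i for i, t in enumerate(vocab)}
    A = np.zeros((len(vocab), len(tokenized_docs)))
    for j, doc in enumerate(tokenized_docs):
        for term, count in Counter(doc).items():
            A[row[term], j] = count
    return A, row


def log_entropy(A):
    """Step 2 (weighting): local weight log(1 + tf) times a global entropy weight per term."""
    n_docs = A.shape[1]
    p = A / np.maximum(A.sum(axis=1, keepdims=True), 1e-12)
    with np.errstate(divide="ignore", invalid="ignore"):
        entropy = np.where(p > 0, p * np.log(p), 0.0).sum(axis=1)
    g = 1.0 + entropy / np.log(n_docs)          # global weight in [0, 1]
    return g[:, None] * np.log1p(A), g


def lsi_space(W, k):
    """Steps 3-4: SVD, then truncation to the k largest singular values."""
    U, s, _ = np.linalg.svd(W, full_matrices=False)
    return U[:, :k], s[:k]


def fold_in(weighted_tf, U_k, s_k):
    """Project a weighted term vector (TS section or patent claim) into the k-dim space."""
    return (weighted_tf @ U_k) / s_k


def cosine(a, b):
    """Step 5: relatedness as the cosine between two vectors in the reduced space."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))


# Toy input (step 1): TS sections are the documents, a patent claim is the query.
ts_sections = [s.lower().split() for s in [
    "the user equipment transmits a scheduling request on the uplink control channel",
    "the base station allocates resource blocks based on reported channel quality",
]]
claim = ("a method wherein the user equipment sends a scheduling request "
         "over an uplink control channel").lower().split()

A, row = term_document_matrix(ts_sections)
W, g = log_entropy(A)
U_k, s_k = lsi_space(W, k=2)

# Section vectors in the reduced space (one per column of W).
section_vecs = [fold_in(W[:, j], U_k, s_k) for j in range(W.shape[1])]

# Fold the claim into the same space using the shared vocabulary and global weights.
claim_counts = np.zeros(len(row))
for term, count in Counter(claim).items():
    if term in row:
        claim_counts[row[term]] = count
claim_vec = fold_in(g * np.log1p(claim_counts), U_k, s_k)

# Report the TS section with the highest semantic overlap (step 5).
scores = {f"section_{j}": round(cosine(claim_vec, v), 3) for j, v in enumerate(section_vecs)}
print(scores, "best:", max(scores, key=scores.get))
```

In this sketch the claim is folded into the space built from the TS sections, every section receives a cosine score, and the highest-scoring section is reported, mirroring the claim-section selection described in step 5.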