Overview
LexisNexis® offers a range of intellectual property data products designed to support different use cases across the patent lifecycle, from raw patent data ingestion to advanced analytics, litigation intelligence, and prosecution strategy.
This guide helps you determine which dataset is right for your needs, based on the type of data you require, how you want to use it, and how you plan to integrate it into your systems.
If you want to learn more and discuss your specific needs after reviewing the FAQs below, please complete this form and a member of the team will reach out to you.
FAQs
What is the difference between IP DataDirect and PatentSight+™ Bulk Data?
The key difference is the level of data and intended use:
IP DataDirect provides document-level patent data, including global patent publications, legal status, and metadata. Best for building internal patent databases or search systems.
PatentSight+ Bulk Data provides patent family–level analytics data, including ownership, technology classifications, and proprietary metrics. Best for portfolio analysis, benchmarking, and strategic insights.
If you need raw patent data, choose IP DataDirect.
If you need analytics-ready patent intelligence, choose PatentSight+ Bulk Data.
When should I use the Global Patent Litigation Dataset?
You should use the Global Patent Litigation Dataset (GPLD) when you need visibility into patent disputes and litigation risk. It is best suited for:
Monitoring litigation activity across jurisdictions
Assessing risk in M&A or patent acquisitions
Tracking competitor enforcement strategies
Analyzing litigation outcomes and trends
Choose GPLD when your focus is litigation, disputes, and risk analysis.
What is the difference between the Global Patent Litigation Dataset (GPLD) and PatentAdvisor® API?
These datasets cover different stages of the patent lifecycle:
GPLD focuses on litigation (what happens after a patent is enforced or challenged)
PatentAdvisor API focuses on prosecution (what happens during patent examination)
Use GPLD to analyze litigation risk and disputes.
Use PatentAdvisor API to optimize filing strategy and predict examiner behavior.
Many organizations use both together to understand end-to-end IP risk and strategy.
Which dataset should I use for building my own patent database?
Use IP DataDirect. It provides:
Global patent publications
Full-text documents
Legal status data
Standardized XML formats
It is specifically designed for data ingestion, storage, and internal system integration.
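As a rough illustration of what ingesting XML records into an internal store can look like, the sketch below parses a small XML sample and loads it into SQLite using only the Python standard library. The element names, attribute names, table layout, and the function name load_documents are all assumptions made for this example; they do not reflect the actual IP DataDirect schema, which you should take from the product documentation.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample: made-up element and attribute names, made-up records.
# Real IP DataDirect XML follows its own schema.
SAMPLE_XML = """
<patent-documents>
  <patent-document publication-number="XX0000001A1" legal-status="active">
    <title>Example widget</title>
    <abstract>A widget used to illustrate ingestion.</abstract>
  </patent-document>
  <patent-document publication-number="XX0000002B1" legal-status="expired">
    <title>Example gadget</title>
    <abstract>A gadget used to illustrate ingestion.</abstract>
  </patent-document>
</patent-documents>
"""


def load_documents(xml_text: str, conn: sqlite3.Connection) -> int:
    """Parse the (hypothetical) XML and upsert one row per document."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS documents ("
        "publication_number TEXT PRIMARY KEY, "
        "legal_status TEXT, title TEXT, abstract TEXT)"
    )
    root = ET.fromstring(xml_text.strip())
    rows = [
        (
            doc.get("publication-number"),
            doc.get("legal-status"),
            doc.findtext("title", default=""),
            doc.findtext("abstract", default=""),
        )
        for doc in root.iter("patent-document")
    ]
    # INSERT OR REPLACE keeps re-runs idempotent for the same publication.
    conn.executemany("INSERT OR REPLACE INTO documents VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")
loaded = load_documents(SAMPLE_XML, conn)
```

Keying the table on the publication number means a repeated delivery of the same record overwrites the earlier row instead of duplicating it, which is one common choice for document-level stores.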
Which dataset is best for patent portfolio analysis and benchmarking?
Use PatentSight+ Bulk Data. It includes:
Patent family–level data
Corporate ownership mapping
Technology clustering
Metrics such as the Patent Asset Index
It is designed for analytics, benchmarking, and strategic decision-making.
Can I combine multiple datasets?
Yes. Many organizations use multiple datasets together. For example:
IP DataDirect to build a patent data foundation
PatentSight+ Bulk Data to perform analytics and benchmarking
GPLD to assess litigation risk
PatentAdvisor API to optimize prosecution strategy
Combining datasets allows organizations to build a complete view of the patent lifecycle, from filing through to litigation.
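One way to picture combining the datasets is as a join across shared identifiers. The sketch below is a minimal example under stated assumptions: it supposes that document-level records carry a family identifier, that family-level records are keyed by the same identifier, and that litigation records reference a publication number. Every field name, record, and the function build_family_view is hypothetical and is not taken from the products' actual schemas.

```python
from collections import defaultdict

# All records and field names below are made up for illustration.
documents = [
    {"publication_number": "XX0000001A1", "family_id": "F1"},
    {"publication_number": "XX0000002B1", "family_id": "F1"},
    {"publication_number": "XX0000003A1", "family_id": "F2"},
]
family_metrics = {
    "F1": {"owner": "Example Corp"},
    "F2": {"owner": "Sample Ltd"},
}
litigation_cases = [
    {"case_id": "C-1", "publication_number": "XX0000002B1"},
]


def build_family_view(documents, family_metrics, litigation_cases):
    """Group documents by family and attach owner and litigation case IDs."""
    # Index litigation cases by the publication they reference.
    cases_by_publication = defaultdict(list)
    for case in litigation_cases:
        cases_by_publication[case["publication_number"]].append(case["case_id"])

    view = {}
    for doc in documents:
        family_id = doc["family_id"]
        family = view.setdefault(
            family_id,
            {
                "owner": family_metrics.get(family_id, {}).get("owner"),
                "publications": [],
                "cases": [],
            },
        )
        family["publications"].append(doc["publication_number"])
        family["cases"].extend(cases_by_publication.get(doc["publication_number"], []))
    return view


view = build_family_view(documents, family_metrics, litigation_cases)
```

In this toy data, family F1 ends up with two publications and one litigation case, while F2 has one publication and no cases.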
Which dataset is right for data science or machine learning projects?
It depends on your objective:
Use IP DataDirect if you need large-scale raw patent data for NLP, text mining, or custom models
Use PatentSight+ Bulk Data if you need structured, analytics-ready data with metrics
Use GPLD if you want to model litigation risk or outcomes
Use PatentAdvisor API if you want to model prosecution success or examiner behavior
How do I decide which dataset is right for my organization?
Ask yourself the following:
1. What type of data do I need?
Raw patent documents → IP DataDirect
Patent analytics → PatentSight+ Bulk Data
Litigation data → GPLD
Prosecution data → PatentAdvisor API
2. What is my primary use case?
Build internal patent systems → IP DataDirect
Portfolio benchmarking → PatentSight+ Bulk Data
Litigation risk analysis → GPLD
Filing strategy optimization → PatentAdvisor API
3. How do I want to use the data?
Store and manage data internally → IP DataDirect
Analyze and visualize insights → PatentSight+ Bulk Data
Integrate litigation intelligence → GPLD
Integrate prosecution analytics → PatentAdvisor API
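The three questions above amount to a simple lookup from an answer to a product. The sketch below encodes them as a dictionary, purely as an illustration; the question keys (data_type, use_case, usage) and the function name recommend are hypothetical, while the answer-to-product pairs are copied from the lists above.

```python
# Illustrative only: keys and function name are hypothetical; the mappings
# mirror the three decision questions in this guide.
RECOMMENDATIONS = {
    "data_type": {
        "raw patent documents": "IP DataDirect",
        "patent analytics": "PatentSight+ Bulk Data",
        "litigation data": "GPLD",
        "prosecution data": "PatentAdvisor API",
    },
    "use_case": {
        "build internal patent systems": "IP DataDirect",
        "portfolio benchmarking": "PatentSight+ Bulk Data",
        "litigation risk analysis": "GPLD",
        "filing strategy optimization": "PatentAdvisor API",
    },
    "usage": {
        "store and manage data internally": "IP DataDirect",
        "analyze and visualize insights": "PatentSight+ Bulk Data",
        "integrate litigation intelligence": "GPLD",
        "integrate prosecution analytics": "PatentAdvisor API",
    },
}


def recommend(answers: dict[str, str]) -> list[str]:
    """Return the distinct products matching the answers, in answer order."""
    picks: list[str] = []
    for question, answer in answers.items():
        product = RECOMMENDATIONS[question][answer.lower()]
        if product not in picks:
            picks.append(product)
    return picks


picks = recommend(
    {
        "data_type": "Raw patent documents",
        "use_case": "Litigation risk analysis",
        "usage": "Store and manage data internally",
    }
)
```

When the answers point to more than one product, as in this usage example, the result lists each product once, which matches the guide's note that datasets can be combined.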
Who typically uses these datasets?
These datasets are commonly used by:
IP and R&D professionals
Legal and compliance teams
Financial analysts and investment teams
Enterprise data and analytics teams
Data scientists and engineers
Where can I learn more about each dataset?
You can explore detailed articles for each product:
What is LexisNexis® IP DataDirect?
What is LexisNexis® PatentSight+™ Bulk Data?
What is the Global Patent Litigation Dataset?
What is the PatentAdvisor® API?
Summary
LexisNexis IP data products are designed to support different stages of the patent lifecycle and different types of analysis.
IP DataDirect → raw patent data
PatentSight+ Bulk Data → patent analytics
Global Patent Litigation Dataset → litigation intelligence
PatentAdvisor API → prosecution analytics
Choosing the right dataset depends on your data needs, use cases, and integration requirements. If you want to learn more and discuss your specific needs after reviewing these FAQs, please complete this form and a member of the team will reach out to you.