Finding Nuggets in Patent Portfolios: Core Patent Mining and Its Applications

Po Hu; Minlie Huang; Xiaoyan Zhu

doi:10.1109/TST.2013.6574672

Tsinghua Science and Technology 2013, 18(4): 339-352 https://doi.org/10.1109/TST.2013.6574672

Open Access | Issue | Published: 05 August 2013

Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China

Keywords: text mining, core patent, patent novelty, patent influence, company competitor

Cite this article:

Hu P, Huang M, Zhu X. Finding Nuggets in Patent Portfolios: Core Patent Mining and Its Applications. Tsinghua Science and Technology, 2013, 18(4): 339-352. https://doi.org/10.1109/TST.2013.6574672

Abstract

Patents are critically important for a company to protect its core business concepts and proprietary technologies. Effective patent mining in massive patent databases provides business enterprises with valuable insights for developing strategies in research and development, intellectual property management, and product marketing, and also helps patent offices improve efficiency and optimize their patent examination processes. This paper addresses the patent mining problem of automatically discovering core patents (i.e., novel and influential patents in a domain). The value of core patent mining is further illustrated by revealing potential competitive relationships among companies through their core patents. Traditional word-based statistical methods do not account for the distinctive vocabulary usage of patents; this work handles it with a topic-based temporal mining approach that quantifies a patent’s novelty and influence through variations in topic activeness. Tests on real-world patent portfolios show that the approach is more effective than state-of-the-art methods.

About this article

Publication history

Received: 27 May 2013

Accepted: 17 June 2013

Published: 05 August 2013

Issue date: August 2013

Acknowledgements

This work was supported by the National Natural Science Foundation of China (No. 61272227), the National Key Technology Research and Development Program (No. 20121857860), and the Tsinghua University Initiative Scientific Research Program (No. 20121088071). We thank the three anonymous reviewers for their valuable comments.