Journal Home > Volume 18, Issue 4

Finding Nuggets in Patent Portfolios: Core Patent Mining and Its Applications

Po Hu, Minlie Huang, Xiaoyan Zhu
Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China

Abstract

Patents are critically important for a company to protect its core business concepts and proprietary technologies. Effective patent mining in massive patent databases not only gives business enterprises valuable insights for developing strategies in research and development, intellectual property management, and product marketing, but also helps patent offices improve efficiency and optimize their patent examination processes. This paper describes the patent mining problem of automatically discovering core patents (i.e., novel and influential patents in a domain). In addition, the value of core patent mining is illustrated by revealing the potential competitive relationships among companies through their core patents. Traditional word-based statistical methods do not account for the unique vocabulary usage of patents; this work addresses that usage with a topic-based temporal mining approach that quantifies a patent’s novelty and influence through variations in topic activeness. Tests on real-world patent portfolios show that this approach is more effective than state-of-the-art methods.

Keywords: text mining, core patent, patent novelty, patent influence, company competitor

Publication history

Received: 27 May 2013
Accepted: 17 June 2013
Published: 05 August 2013
Issue date: August 2013

Copyright

© The author(s) 2013

Acknowledgements

This work was supported by the National Natural Science Foundation of China (No. 61272227), the National Key Technology Research and Development Program (No. 20121857860), and the Tsinghua University Initiative Scientific Research Program (No. 20121088071). We thank the three anonymous reviewers for their valuable comments.
