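Core Chinese journal in comprehensive science and technology (Peking University Core Journals list)

Source journal of the Chinese Science Citation Database (CSCD)

Indexed in Chemical Abstracts (CA), USA

Indexed in Mathematical Reviews (MR), USA

Indexed in Abstract Journal, Russia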

Issue 6
Jan.  2014
XIE Hao, JIANG Hong. Improved LDA model for microblog topic mining[J]. Journal of East China Normal University (Natural Sciences), 2013, (6): 93-101.

Improved LDA model for microblog topic mining

  • Received Date: 2012-11-01
  • Revised Date: 2013-02-01
  • Publish Date: 2013-11-25
  • With the rapid growth in Sina microblog users, microblog websites have become platforms where a wide range of users obtain information. Because microblog posts are short, length-restricted texts, traditional topic models do not analyze their content well. This paper proposes RT-LDA, a microblog generation model based on LDA. Gibbs sampling is used for inference in the model, which both mines the topics of each microblog accurately and infers the distribution of the topics of interest. Experiments on real data verify the effectiveness of RT-LDA for microblog topic mining.
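The abstract names Gibbs sampling as the inference method, but this page does not reproduce the RT-LDA sampling equations. For orientation only, the following is a minimal sketch of collapsed Gibbs sampling for standard LDA, not the authors' RT-LDA model. The function name `lda_gibbs`, the hyperparameter values `alpha` and `beta`, and the integer word-id input format are all assumptions made for illustration.

```python
import random


def lda_gibbs(docs, num_topics, vocab_size, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Collapsed Gibbs sampling for plain LDA (illustrative; NOT RT-LDA).

    docs: list of documents, each a list of integer word ids in [0, vocab_size).
    Returns (theta, phi): per-document topic distributions and
    per-topic word distributions, estimated from the final sample.
    """
    rng = random.Random(seed)
    doc_topic = [[0] * num_topics for _ in docs]                 # words in doc d assigned to topic k
    topic_word = [[0] * vocab_size for _ in range(num_topics)]   # times word w is assigned to topic k
    topic_total = [0] * num_topics                               # total words assigned to topic k
    assign = []                                                  # topic assignment of every token

    # Random initialisation of the topic assignments.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(num_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assign.append(zs)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the current token from the counts.
                k = assign[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Unnormalised conditional p(z = t | all other assignments).
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                # Resample the topic and restore the counts.
                k = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    theta = [
        [(doc_topic[d][t] + alpha) / (len(doc) + num_topics * alpha)
         for t in range(num_topics)]
        for d, doc in enumerate(docs)
    ]
    phi = [
        [(topic_word[t][w] + beta) / (topic_total[t] + vocab_size * beta)
         for w in range(vocab_size)]
        for t in range(num_topics)
    ]
    return theta, phi


if __name__ == "__main__":
    # Toy corpus of four short "posts" over a four-word vocabulary (made-up data).
    toy_docs = [[0, 1, 0, 1], [2, 3, 2, 3], [0, 0, 1], [3, 2, 2]]
    theta, phi = lda_gibbs(toy_docs, num_topics=2, vocab_size=4, iters=50)
    for d, row in enumerate(theta):
        print(d, [round(p, 3) for p in row])
```

Each row of `theta` is a post's topic mixture and each row of `phi` is a topic's word distribution. A model tailored to microblogs, such as the RT-LDA described in the abstract, would change the generative process and therefore the sampling conditional; the abstract does not give that form, so it is not attempted here.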
