First, you should load recharts:

library(recharts)

1 Introduction

Word cloud has only one type: wordCloud

The keys are:

  • character x represents the words
  • numeric y represents the frequency of the words
  • series is not linked with legend, but linked with colors
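
As a minimal sketch of data in this shape, the base R snippet below builds a character/numeric pair from a vector of tokens. The words and the column names Keyword and Freq are illustrative assumptions chosen to match the showcase later on; recharts does not require these names.

```r
# Made-up tokens; any character vector of words would do
words <- c("chart", "plot", "chart", "cloud", "plot", "chart")

# Count occurrences, keeping the words as character rather than factor
freq <- as.data.frame(table(words), stringsAsFactors = FALSE)
names(freq) <- c("Keyword", "Freq")

# y must be numeric; table() yields integer counts, so convert explicitly
freq$Freq <- as.numeric(freq$Freq)

# Most frequent words first
freq <- freq[order(freq$Freq, decreasing = TRUE), ]
str(freq)
```

A data.frame like freq could then be passed as data, with Keyword as x and Freq as y.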

2 Function Call

echartr(data, x, y, <series>, <t>, <type>)

Arg     Requirement
data    source data in the form of a data.frame
x       character independent variable. Only the first one is accepted if multiple variables are provided.
y       numeric dependent variable. Only the first one is accepted.
series  data series variable, which will be coerced to a factor. Only the first one is accepted if multiple variables are provided.
t       timeline variable, which will be coerced to a factor. Only the first one is accepted if multiple variables are provided.
type    ‘wordCloud’

3 Showcase

3.1 Data Preparation

Fetch the Baidu buzz hot word web page and parse it into a data.frame with the columns Keyword, Freq, and Trend.

For this purpose, we wrote a function, getBaiduHot, that parses the Baidu Hot Word Trend web page.

getBaiduHot <- function(url, top=30, HTMLencoding=NULL){
    baiduhot <- paste0(readLines(url), collapse="")
    charset <- gsub('^.+charset=([[:alnum:]-]+?)[^[:alnum:]-].+$', "\\1", 
                    baiduhot)
    # gsub() always returns a string, so charset is never NULL; use it as the fallback
    if (is.null(HTMLencoding)) HTMLencoding <- charset
    baiduhot <- stringr::str_conv(baiduhot, HTMLencoding)
    hotword <- gsub(".+?<a class=\"list-title\"[^>]+?>([^<>]+?)</a>.+?<span class=\"icon-(rise|fair|fall)\">(\\d+?)</span>.+?","\\1\t\\3\t\\2\t", baiduhot)
    hotword <- enc2native(gsub("^(.+?)\t{4,}.+$","\\1", hotword))
    hotword <- t(matrix(unlist(strsplit(hotword,"\t")), nrow=3))
    hotword <- as.data.frame(hotword, stringsAsFactors=FALSE)
    names(hotword) <- c("Keyword", "Freq", "Trend")
    hotword$Freq <- as.numeric(hotword$Freq)
    hotword <- hotword[order(hotword$Freq, decreasing=TRUE),]
    # head() avoids NA-filled rows when fewer than `top` entries were parsed
    return(head(hotword, top))
}
hotword <- getBaiduHot("http://top.baidu.com/buzz?b=1", HTMLencoding = 'GBK')
knitr::kable(hotword)

|    | Keyword | Freq | Trend |
|----|---------|------|-------|
| 11 | 小姑娘你火了 | 116955 | rise |
| 10 | 曝美女兵裸照丑闻 | 115900 | fair |
| 12 | 试探男友谎称绑架 | 106834 | fall |
| 9 | 富二代玩枪建工厂 | 76881 | rise |
| 13 | 男子谋生杀猫卖钱 | 42903 | fall |
| 16 | 儿生日父亲送毒品 | 39328 | fall |
| 14 | 男子撞脸达尔文 | 38389 | rise |
| 17 | 老太眼内8条活虫 | 33053 | rise |
| 5 | 铁路运行图将调整 | 27491 | rise |
| 1 | 清洁工擦窗困楼外 | 25871 | rise |
| 30 | 两架小型飞机相撞 | 21520 | fall |
| 46 | 耐克气垫门曝光 | 21170 | rise |
| 15 | 敖厂长被威胁事件 | 20979 | rise |
| 6 | 蒙冤16年回老家 | 18929 | rise |
| 7 | 离婚冷静期通知书 | 17932 | rise |
| 24 | 小学课文被指杜撰 | 12642 | fall |
| 2 | 贾静雯三胎再产女 | 12292 | rise |
| 27 | 偷上万元发红包 | 11621 | fall |
| 19 | 香港旺角暴乱罪成 | 11488 | fall |
| 45 | 安以轩宣布结婚 | 10655 | fall |
| 23 | 陈妍希短裙秀美腿 | 10652 | fall |
| 3 | 秘鲁洪灾泥石流 | 10441 | rise |
| 22 | 10亿建豪华校区 | 10137 | fall |
| 41 | 沈梦辰揭澡戏内幕 | 9333 | rise |
| 20 | 李维嘉终于笑了 | 9205 | fall |
| 21 | 洋妞街头脱衣暴走 | 9023 | rise |
| 18 | 男婴出生就18岁 | 8981 | rise |
| 25 | 火锅底料用560次 | 8295 | rise |
| 31 | 李维嘉被经纪人骗 | 8161 | rise |
| 26 | 捡到钱包要求陪睡 | 6526 | rise |
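
The two regex steps inside getBaiduHot (charset detection and row extraction) can be tried offline. The sketch below is not the function itself: it swaps the gsub/strsplit approach for regmatches with regexec, and it runs on a hypothetical HTML fragment that only imitates the markup the original pattern expects. The fragment, the keywords, and the variable names are all assumptions made for illustration.

```r
# Hypothetical HTML fragment; the real page layout may differ
html <- paste0(
  '<meta charset="GBK">',
  '<tr><td><a class="list-title" href="#">alpha</a></td>',
  '<td><span class="icon-rise">300</span></td></tr>',
  '<tr><td><a class="list-title" href="#">beta</a></td>',
  '<td><span class="icon-fall">120</span></td></tr>'
)

# Step 1: pull the declared charset out of the page source
cs <- regmatches(html, regexpr('charset=["\']?[[:alnum:]-]+', html))
cs <- sub('charset=["\']?', '', cs)

# Step 2: find every keyword/trend/count triple
pattern <- paste0(
  '<a class="list-title"[^>]+?>([^<>]+?)</a>.+?',
  '<span class="icon-(rise|fair|fall)">(\\d+?)</span>'
)
rows <- regmatches(html, gregexpr(pattern, html, perl = TRUE))[[1]]

# regexec() returns the full match followed by the capture groups
parts <- regmatches(rows, regexec(pattern, rows, perl = TRUE))
parsed <- do.call(rbind, lapply(parts, function(p) {
  data.frame(Keyword = p[2], Freq = as.numeric(p[4]), Trend = p[3],
             stringsAsFactors = FALSE)
}))
print(cs)
print(parsed)
```

Extracting matches directly, rather than rewriting the whole page with gsub, keeps unmatched page content out of the result, at the cost of a second regex pass over each row.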

3.2 Basic Plot

Only provide x and y.

echartr(hotword, Keyword, Freq, type='wordCloud') %>% 
    setTitle('Baidu Hot Word Top30 (realtime)', as.character(Sys.time()))

3.3 Color by Series

We want to group the hot words. Let’s assign ‘Trend’ as the series variable. The ‘rise’, ‘fair’, and ‘fall’ series are colored differently.

echartr(hotword, Keyword, Freq, Trend, type='wordCloud') %>% 
    setTitle('Baidu Hot Word Top30 (realtime)', as.character(Sys.time()))

3.4 With Timeline

Let’s compare the realtime, today, and 7-day hot words.

First, get the other two web pages and rbind the datasets.

hotword$t <- 'Realtime'
hotword1 <- getBaiduHot("http://top.baidu.com/buzz?b=341&fr=topbuzz_b1",
                        HTMLencoding = 'GBK')
hotword1$t <- 'Today'
hotword2 <- getBaiduHot("http://top.baidu.com/buzz?b=42&c=513&fr=topbuzz_b341",
                        HTMLencoding = 'GBK')
hotword2$t <- '7-days'
hotword <- do.call('rbind', list(hotword, hotword1, hotword2))
hotword$t <- factor(hotword$t, levels=c('Realtime', 'Today', '7-days'))
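
The same bind-and-label step can be sketched offline with toy data. The three small data frames below are made-up stand-ins for the scraped tables. The explicit levels argument is the point of interest: without it, factor() orders the levels by sorting the labels, so the periods would presumably not appear in the intended sequence.

```r
# Toy stand-ins for the three scraped tables (values are made up)
realtime <- data.frame(Keyword = c("a", "b"), Freq = c(10, 5),
                       stringsAsFactors = FALSE)
today    <- data.frame(Keyword = c("a", "c"), Freq = c(8, 7),
                       stringsAsFactors = FALSE)
week     <- data.frame(Keyword = c("b", "c"), Freq = c(20, 9),
                       stringsAsFactors = FALSE)

# Label each table with its period, then stack them
realtime$t <- "Realtime"
today$t    <- "Today"
week$t     <- "7-days"
combined <- rbind(realtime, today, week)

# Fix the level order explicitly instead of relying on sorted labels
combined$t <- factor(combined$t, levels = c("Realtime", "Today", "7-days"))
levels(combined$t)
table(combined$t)
```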

Then draw the chart.

g <- echartr(hotword, Keyword, Freq, t=t, type='wordCloud') %>% 
    setTitle('Baidu Hot Word Top30')
g

4 Further Setup

Next, you can configure the widgets, add markLines and/or markPoints, and fortify the chart.

4.1 setTheme

g %>% setTheme('dark', palette='manyeyes')

You can refer to related functions to play around on your own.