wh_news.Rd
Get access to structured posts data from news articles, blog posts and online discussions.
wh_news(token, q, ts = (Sys.time() - (3 * 24 * 60 * 60)), sort = NULL, order = NULL, accuracy = NULL, highlight = NULL, latest = NULL, quiet = !interactive())
token | your token as returned by |
---|---|
q | a string query containing the filters that define which posts will be returned. |
ts | the "ts" (timestamp) parameter tells the system to return results that were crawled after this timestamp ( |
sort | by default (when the sort parameter isn't specified) the results are sorted by the recommended order of crawl date. See details for valid values. |
order | If you choose to order the posts by any of the numeric |
accuracy | return only posts with high extraction accuracy; this removes about 30% of the total matching posts (those with lower confidence). |
highlight | return the fragments in the post that matched the textual Boolean query. The matched keywords will be surrounded by |
latest | this will return the latest 100 crawled posts matching your query (**NOT** recommended). |
quiet | if |
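The default for `ts` in the usage line is `Sys.time() - (3 * 24 * 60 * 60)`. The following is a minimal sketch of what that expression evaluates to, using base R only. The variable names `three_days`, `now` and `ts_default` are illustrative and are not part of the package.

```r
# Default offset from the usage line: three days, expressed in seconds
three_days <- 3 * 24 * 60 * 60

now <- Sys.time()

# Subtracting a plain number from a POSIXct value subtracts seconds
ts_default <- now - three_days

# The gap between the two times is three days
difftime(now, ts_default, units = "days")

# Numeric form: seconds since the Unix epoch
as.numeric(ts_default)
```

Passing a different `ts` simply moves this cutoff; the arithmetic stays the same.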
object of class webhoser
Valid sort values
relevancy
social.facebook.likes
social.facebook.shares
social.facebook.comments
social.gplus.shares
social.pinterest.shares
social.linkedin.shares
social.stumbledupon.shares
social.vk.shares
replies_count
participants_count
spam_score
performance_score
published
thread.published
domain_rank
ord_in_thread
rating
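Since `sort` only accepts the values listed above, it can be useful to check a value before building a request. The sketch below is one way to do that in plain R. `valid_sorts` and `check_sort()` are hypothetical names introduced for illustration; they are not part of webhoser, and the package may perform its own validation.

```r
# The sort values listed in this help page
valid_sorts <- c(
  "relevancy",
  "social.facebook.likes", "social.facebook.shares", "social.facebook.comments",
  "social.gplus.shares", "social.pinterest.shares", "social.linkedin.shares",
  "social.stumbledupon.shares", "social.vk.shares",
  "replies_count", "participants_count",
  "spam_score", "performance_score",
  "published", "thread.published",
  "domain_rank", "ord_in_thread", "rating"
)

# Hypothetical helper (an assumption, not a webhoser function):
# returns NULL for the default ordering, the value itself if it is
# valid, and raises an error otherwise.
check_sort <- function(sort = NULL) {
  if (is.null(sort)) {
    return(NULL)
  }
  if (!is.character(sort) || length(sort) != 1 || !(sort %in% valid_sorts)) {
    stop(
      "`sort` must be one of: ", paste(valid_sorts, collapse = ", "),
      call. = FALSE
    )
  }
  sort
}

check_sort("social.facebook.likes")
```

The result of `check_sort()` could then be passed on as the `sort` argument of `wh_news()`.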
See official documentation for valid filters.
# NOT RUN {
token <- wh_token("xXX-x0X0xX0X-00X")

rstats <- wh_news(q = '"Programming language"') %>% # use quote marks!
  wh_collect() # collect results

wh_news(
  q = paste0(
    '"US President" OR Trump crawled>:',
    as.numeric(Sys.time() - (3 * 24 * 60 * 60))
  )
) %>%
  wh_paginate(
    p = 1,
    ts = as.numeric(Sys.time() - (3 * 24 * 60 * 60))
  ) %>%
  wh_collect() -> trump
# }
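The second example assembles its query string with `paste0()`. To make the resulting string easier to inspect without calling the API, here is a small sketch that builds the same kind of query from a fixed reference time. The names `ref_time`, `since` and `q` are illustrative, and the fixed date is an arbitrary assumption chosen only to make the result reproducible.

```r
# A fixed reference time (arbitrary, for reproducibility) instead of Sys.time()
ref_time <- as.POSIXct("2018-01-04 00:00:00", tz = "UTC")

# Three days earlier, as seconds since the Unix epoch
since <- as.numeric(ref_time - (3 * 24 * 60 * 60))

# Single quotes outside so the double quotes around the phrase
# are kept literally in the query (the "use quote marks!" note above)
q <- paste0('"US President" OR Trump crawled>:', since)

# cat() prints the string without escaping the inner quotes
cat(q, "\n")
```

With `Sys.time()` in place of the fixed date, the numeric value will usually carry fractional seconds, so the tail of the string will differ from run to run.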