In general

 

Our mission is to track the conversations around research outputs online wherever they're happening. To that end we're constantly looking for new sources of data.

 

Specifically, we look for sources that regularly link to scholarly content. For some source types, such as newspapers and policy documents, we can also use text mining to match research that is mentioned (but not linked to) with published scholarly articles, although we can only do this robustly under certain circumstances.

 

Sometimes we're held back by the level of access available, and sometimes a source just doesn't contain enough content to be worth tracking. Even so, we know there are plenty of sites that we should track and just haven't gotten round to yet. If you know of any, please let us know through the feedback option or by email.


Research outputs 


We are actively tracking mentions related to the following types of research outputs:

  1. Books
  2. Book Chapters
  3. Journal Articles
  4. Presentations
  5. Reports 
  6. Data Sets
  7. Policy Documents 
  8. Syllabi
  9. And More!

If a research output is from a domain we have whitelisted and contains a persistent identifier, we can track it.
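As a rough illustration of the rule above, the check can be thought of as two conditions that must both hold. The sketch below is not Altmetric's actual code: the domain list, the function names, and the choice of a DOI as the persistent identifier are all assumptions made for the example.

```python
import re
from urllib.parse import urlparse

# Hypothetical whitelist; the real list of domains is not given in this article.
WHITELISTED_DOMAINS = {"example-publisher.org", "repository.example.edu"}

# Common DOI shape: "10.", a 4-9 digit registrant code, "/", then a suffix.
# Assumption: the DOI stands in for "persistent identifier" in this sketch.
DOI_PATTERN = re.compile(r"\b10\.\d{4,9}/[^\s\"<>]+")


def find_doi(text):
    """Return the first DOI-like string in the text, or None."""
    match = DOI_PATTERN.search(text)
    if match is None:
        return None
    # Trim punctuation that usually belongs to the sentence, not the DOI.
    return match.group(0).rstrip(".,;)")


def is_trackable(url, page_text):
    """True if the URL is on a whitelisted domain and the page has a DOI."""
    host = (urlparse(url).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    return host in WHITELISTED_DOMAINS and find_doi(page_text) is not None
```

For example, `is_trackable("https://www.example-publisher.org/articles/1", "doi: 10.1234/abcd.5678.")` passes both conditions, while the same text on a domain outside the list does not.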

Data sources

 

Altmetric currently tracks the following sources for mentions of research outputs. Where possible we surface the original text of each mention, and in some cases are also able to provide demographic data on the author of the mention. It's crucial to us that all of our data is fully auditable, and that you can see not only how many people are talking about the research, but who they are and what they're saying. 

 

1) Policy documents   

 

We track a wide range of public policy documents for mentions, and are adding more every month.     

 

2) Mainstream media 

    

You can check out the news sources page on the Altmetric website for the latest list of news sources that we track. This list currently extends to over 2,000 English and non-English global news outlets.    

 

3) Blogs 

  

We maintain a manually curated list of over 9,000 academic and non-academic blogs. These are tracked automatically via RSS feeds.     
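To give a sense of what tracking a blog via RSS can involve, here is a minimal sketch that reads an RSS 2.0 feed and pulls out DOI links from each post. It is an illustration under stated assumptions, not a description of our pipeline: the function name, the sample feed, and the decision to look only for `doi.org` links in the `description` element are all hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Assumption: scholarly links are recognised by a doi.org URL in the post body.
DOI_LINK = re.compile(r"https?://(?:dx\.)?doi\.org/(10\.\d{4,9}/[^\s\"'<>]+)")


def harvest_dois(rss_xml):
    """Return (post link, DOI) pairs found in the items of an RSS 2.0 feed."""
    root = ET.fromstring(rss_xml)
    found = []
    for item in root.iter("item"):
        post_link = item.findtext("link", default="")
        body = item.findtext("description", default="")
        for doi in DOI_LINK.findall(body):
            found.append((post_link, doi))
    return found


# Made-up feed used only to exercise the function.
SAMPLE_FEED = """<rss version="2.0"><channel><title>Example blog</title>
<item><title>A new result</title><link>https://blog.example.org/post-1</link>
<description>Discussed in &lt;a href="https://doi.org/10.1234/abcd.5678"&gt;this paper&lt;/a&gt;.</description></item>
<item><title>No links here</title><link>https://blog.example.org/post-2</link>
<description>Just an opinion piece.</description></item>
</channel></rss>"""

print(harvest_dois(SAMPLE_FEED))
```

Only the first post in the sample feed contains a DOI link, so only that post would count as a mention.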

 

4) Online reference managers 

  • Mendeley       
  • CiteULike   

 

5) Post-publication peer-review forums

  • PubPeer
  • Publons   

 

6) Social media

  • Twitter (public comments and retweets only, no favourites)       
  • Facebook (posts on public pages only, no individual timeline posts and no likes)       
  • Google+       
  • Historical data: Pinterest - We can no longer pick up mentions from Pinterest, but you will still see historical mentions on details pages.        
  • Reddit (original posts only, not comments)
  • Historical data: LinkedIn groups - LinkedIn have now unfortunately closed their data stream so we are unable to pick up new mentions from this source. You will still see mentions made before the stream was closed.

7) Other online sources

  • Wikipedia     
  • Sites running Stack Exchange (Q&A)     
  • Reviews on F1000     
  • YouTube 
  • Open Syllabus

 

8) Publisher download count data

 

Altmetric can optionally harvest download counts from publishers that make this data available through an API or by bulk download. This data doesn't contribute to the Altmetric Attention Score but can be shown to users through the embedded badges and details pages.



How does Altmetric collect mentions from each data source?

The table below lists the collection method, update frequency and any additional details for each of our sources.

Source name | Collection method | Update frequency | Notes
Twitter | Third-party data provider API | Real-time feed | Demographics, support for retweets, with monitoring of suspicious activity.
Facebook | Facebook API | Daily | Public Facebook Pages and posts only, with prioritised popular Pages.
Policy documents | PDFs collected and scanned from policy sources and repositories | Daily | Scanning and text-mining policy document PDFs for references, which are looked up in CrossRef/PubMed and resolved to DOIs.
News | RSS feeds and API | Real-time feed | Manually curated news sources, with data provided via a third-party provider and direct RSS feeds.
Blogs | RSS feeds | Daily | Manually curated list, harvesting links to scholarly content.
Mendeley | Mendeley API | Daily | Reader count is the number of readers with the output in their Library. Not included in score.
CiteULike | CiteULike API | Daily | Reader count is the number of users with the output in their Library. Not included in score.
Post-publication peer reviews | PubPeer and Publons APIs | Daily | Peer review comments collected from item records and associated by unique identifier.
Reddit | Reddit API | Daily | Includes all subreddits. Original posts only, no comments.
Wikipedia | Wikipedia API | Real-time feed | Mentions of scholarly outputs collected from References section. English Wikipedia only.
Q&A (Stack Overflow) | Stack Overflow API | Daily | Scan for links to scholarly outputs.
F1000 Reviews | F1000 API | Daily | Scan for links to scholarly outputs.
Google+ | Google+ API | Daily | Public posts only.
YouTube | YouTube API | Daily | Scan for links to scholarly outputs in video comments.
Open Syllabus | Static import from Open Syllabus | Quarterly | Link syllabi's contents to HLOM IDs.
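
For readers who want to reason about the update frequencies programmatically, the table can be expressed as a small lookup. This is only a sketch: the source names and frequencies are copied from the table, but the interval lengths (one day for "Daily", 91 days for "Quarterly"), the `is_due` helper, and the idea of polling on a timer are assumptions for illustration.

```python
from datetime import datetime, timedelta

# Update frequency per source, as listed in the table.
SOURCE_FREQUENCY = {
    "Twitter": "Real-time feed",
    "Facebook": "Daily",
    "Policy documents": "Daily",
    "News": "Real-time feed",
    "Blogs": "Daily",
    "Mendeley": "Daily",
    "CiteULike": "Daily",
    "Post-publication peer reviews": "Daily",
    "Reddit": "Daily",
    "Wikipedia": "Real-time feed",
    "Q&A (Stack Overflow)": "Daily",
    "F1000 Reviews": "Daily",
    "Google+": "Daily",
    "YouTube": "Daily",
    "Open Syllabus": "Quarterly",
}

# Assumed interval lengths; the article does not define them precisely.
POLL_INTERVALS = {
    "Daily": timedelta(days=1),
    "Quarterly": timedelta(days=91),
}


def is_due(source, last_run, now):
    """Return True if a polled source should be collected again."""
    interval = POLL_INTERVALS.get(SOURCE_FREQUENCY[source])
    if interval is None:
        # Real-time feeds are streamed rather than polled, so never "due".
        return False
    return now - last_run >= interval
```

Under these assumptions, a blog last collected a full day ago is due again, whereas Open Syllabus would not be due until roughly three months after the previous import.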