Hong Kong Baptist University Library

Sources for Text-Mining: Social Media Data

Social media data mining collects and analyzes unstructured information (such as posts, comments, tweets, shares, likes, mentions, and clicks) shared on networks like Facebook and Twitter. This information is a rich source of text data that academic researchers can use to:

  • Identify current and future trends
  • Capture major social issues
  • Understand public opinions and sentiments on social issues
  • Analyze marketing and journalistic practice
  • Facilitate sophisticated predictive modeling

The Springer article "Social Media Analytics: A Survey of Techniques, Tools, and Platforms" provides an easy-to-read overview of social media data mining and the essential background knowledge.

According to another article, "Collection Social Media Data for Research", there are three main ways to collect social media data:

  • Do it manually -- It is very time-consuming, and thus not recommended
  • Use web services -- It is the easiest way to collect data, but costs money. Also, you may not be able to get the data you want.
  • Use coding -- It provides the highest level of flexibility, but you have to learn how to code and keep up to date with the changing policies of social media platforms.

This guide will briefly discuss the second and third ways.

An increasing number of web services are available on the market to help you collect and analyze social media data. Some services are developed with researchers as one of their primary customer groups, including Volunteer Science and uMaxData. Others are developed for more general purposes but are also commonly used in academia, such as Export Tweet and WebTunix.

Most web services can provide custom-made mining solutions to meet the specific requirements of sophisticated research projects, but they cost money. As an alternative option, the Library has subscribed to a web service plan (with specified data coverage) for the use of the HKBU community.



The tool has been installed in the Data Analysis Room. Please book a workstation in advance. Every user can create a personal account so that previous search queries, search results, and analyses can be saved and downloaded.

  • Sources -- Online news, Facebook and Twitter pages of KOLs and public sectors, public blogs and forums, etc.
  • Region -- Hong Kong data only
  • Coverage -- Jan 1 2017 - current
  • Data Mining Dashboard -- displays trends, source distribution, communication paths, automatic sentiment analysis, word clouds, top posts, etc.
  • Computer Assisted Content Analysis Module -- supports data pooling, data categorization, sentiment analysis, etc.

If you wish to start with coding, you will need to be familiar with APIs and a programming language such as R or Python to collect data. Please refer to the following resources to learn the relevant techniques:


Python   FREE
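
As a rough illustration of the coding approach, the Python sketch below parses a JSON response of the kind a social media API might return and counts the most frequent terms in the post text. The response layout (`data`, `id`, `text`), the sample posts, the stopword list, and the function names are all assumptions made up for this example. Each platform defines its own format, usually requires an API key, and sets its own terms of use, so please check the platform's documentation before collecting real data.

```python
import json
import re
from collections import Counter

# Hypothetical API response used for illustration only.
# Real platforms each define their own JSON layout and field names.
SAMPLE_RESPONSE = """
{"data": [
  {"id": "1", "text": "Library opening hours extended during exams!"},
  {"id": "2", "text": "Exams next week. The library is packed."},
  {"id": "3", "text": "Great talk on text mining at the library today"}
]}
"""

# A tiny illustrative stopword list; real projects use a fuller one.
STOPWORDS = {"the", "is", "on", "at", "a", "an", "and", "of", "to", "in", "during"}


def extract_texts(raw_json):
    """Return the text of every post in the (assumed) response layout."""
    payload = json.loads(raw_json)
    return [post["text"] for post in payload.get("data", [])]


def term_frequencies(texts, stopwords=STOPWORDS):
    """Count lowercase word tokens across all texts, skipping stopwords."""
    counts = Counter()
    for text in texts:
        tokens = re.findall(r"[a-z']+", text.lower())
        counts.update(t for t in tokens if t not in stopwords)
    return counts


if __name__ == "__main__":
    texts = extract_texts(SAMPLE_RESPONSE)
    for term, n in term_frequencies(texts).most_common(3):
        print(term, n)
```

In a real project, the sample string would be replaced by the response body returned from the platform's API, and the resulting counts could feed a word cloud or a trend chart.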

There are other social media research tools that can be used to collect (and/or analyze) social media data. Some tools are free to use and require little or no programming. Some tools are costly or coding intensive. The following lists provide an overview.

  • Social Media Data Collection Tools
  • Social Media Research Toolkit
  • Library Facility

    Data Software
    Available in the Library

    View full table here.

    Library Services

    Look Out for Semester-Based
    Data Software Training!

    Review previous and upcoming training information here

    Other Relevant LibGuides:

    See other guides related to data management and analytics here

    Find out more

    Feel free to contact me if you have questions about
    Research Data Services

    Rebekah Wong
    Head, Digital & Multimedia Services