Data is a set of values of qualitative or quantitative variables that scholars draw upon to support their claims and/or produce new knowledge.
This guide walks through the six steps of the Data Life Cycle and recommends tools for each step.
Before collecting data, it is best to plan ahead and ask yourself: What types and formats of data will be collected? Are any copyright issues involved? What are the best approaches to storing and backing up the data? See the Research Data Management library guide for more information.
Data can be collected…
Library Services:
If you have difficulty completing the research data management plans (DMPs) requested by publishers or funding agencies, please feel free to contact our Scholarly Communications Librarian, Pauline Lam, at lamlhp@hkbu.edu.hk.
This step involves data entry (if the raw data is not collected in a digital format), data conversion (from one system or format to another), and data cleaning.
Data cleaning requires tedious and time-consuming manual work, but its importance should not be underestimated. Proper data cleaning saves researchers from having to return to this step at a later stage of the research and helps them avoid drawing false conclusions.
The following data cleaning tips can serve as a starting point:
Some of these points are mentioned in EliteDataScience. Go there for a more comprehensive explanation.
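As a rough illustration of what data cleaning can look like in practice, the sketch below uses only the Python standard library. The sample data, the column names (`id`, `department`, `score`), and the helper `clean_rows` are all hypothetical and are not taken from this guide. The sketch trims stray spaces, normalises capitalisation, keeps missing values as missing, and drops exact duplicate rows.

```python
import csv
import io

# Hypothetical raw survey export: inconsistent case, stray spaces,
# a duplicate row, and a missing value.
RAW_CSV = """id,department,score
1, Biology ,85
2,biology,
2,biology,
3,PHYSICS,92
"""

def clean_rows(raw_text):
    """Return de-duplicated rows with normalised text and parsed numbers."""
    reader = csv.DictReader(io.StringIO(raw_text))
    seen = set()
    cleaned = []
    for row in reader:
        record = {
            "id": int(row["id"].strip()),
            # Normalise whitespace and capitalisation of category labels.
            "department": row["department"].strip().title(),
            # Keep missing scores as None rather than guessing a value.
            "score": float(row["score"]) if row["score"].strip() else None,
        }
        key = tuple(record.values())
        if key in seen:  # skip exact duplicates
            continue
        seen.add(key)
        cleaned.append(record)
    return cleaned

rows = clean_rows(RAW_CSV)
for r in rows:
    print(r)
```

Whether a missing value should be kept, dropped, or imputed depends on the study design, so the sketch deliberately leaves it as `None`.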
Software Recommendations:
This is the most challenging but also most exciting part of the cycle. It can involve quantitative analysis, qualitative analysis, machine learning, etc.
This guide does not cover basic statistics, which can easily be found on the Internet. (If you are unsure where to look, you may start with Statistics How To.) Instead, we introduce commonly used software tools.
Software Recommendations:
Quantitative Analysis Software
Qualitative Analysis Software
Programming Languages That Provide Integrated Support from Data Preparation to Web Applications
The following two programming languages can support many aspects of the data life cycle, including web crawling, statistics, data manipulation, machine learning, data visualization, and web applications.
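To give a feel for how a general-purpose language handles the statistics part of the cycle, here is a minimal sketch in Python. Python is used purely as an illustrative choice. The two groups of scores are made up, and `summarise` is a hypothetical helper. Only the standard-library `statistics` module is used.

```python
import statistics

# Hypothetical exam scores for two groups; the numbers are made up.
group_a = [72, 85, 90, 66, 78]
group_b = [88, 92, 79, 95, 84]

def summarise(values):
    """Return basic descriptive statistics for a list of numbers."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
    }

summary_a = summarise(group_a)
summary_b = summarise(group_b)
print("Group A:", summary_a)
print("Group B:", summary_b)
```

Real projects would usually go further, with dedicated libraries for modelling and visualization, but the same pattern of loading values, summarising them, and comparing groups still applies.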
Library Services:
Stay tuned for our semester-based Research Data Tools Series training if you want to learn how to use these software tools. We also offer a limited number of course-embedded basic training sessions each year.
This step involves both short-term and long-term measures. Short-term measures include proper file version control during a research project. Long-term measures involve archiving: migrating data to the best format and storing it in the most suitable medium for future use by you or your company. You may learn more about this through TechTarget.
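The idea of keeping numbered file versions can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the naming scheme (`survey_v001.csv`), the helper `save_version`, and the use of a SHA-256 checksum to confirm later that a stored copy is unchanged are choices made for this example. It is not a substitute for a dedicated version control tool.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path

def save_version(source, archive_dir, version):
    """Copy `source` into `archive_dir` under a versioned name and
    return the copy's path together with its SHA-256 checksum."""
    source = Path(source)
    archive_dir = Path(archive_dir)
    archive_dir.mkdir(parents=True, exist_ok=True)
    # e.g. survey.csv -> survey_v001.csv (hypothetical naming scheme)
    target = archive_dir / f"{source.stem}_v{version:03d}{source.suffix}"
    shutil.copy2(source, target)
    digest = hashlib.sha256(target.read_bytes()).hexdigest()
    return target, digest

# Demonstration in a temporary directory so nothing real is touched.
with tempfile.TemporaryDirectory() as tmp:
    data_file = Path(tmp) / "survey.csv"
    data_file.write_text("id,score\n1,85\n", encoding="utf-8")
    copy_path, checksum = save_version(data_file, Path(tmp) / "archive", 1)
    copy_name = copy_path.name
    copy_matches = copy_path.read_bytes() == data_file.read_bytes()
    print(copy_name, checksum)
```

Recording the checksum alongside the archived file makes it possible to verify, years later, that the stored copy has not been corrupted.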
Tool Recommendations:
Version Control Tool
Data storage mainly concerns the internal use of data, whereas data sharing refers to open data that the public can access and re-use for free. Open data is not only a trend but also an obligation that researchers are encouraged to meet for the benefit of academia and society. Some major publishers, e.g., Nature and Science, also ask authors to share their data.
Data can be shared in its original form (after removing private and sensitive information) through publicly accessible data repositories. Researchers can also choose to share their data through data visualizations or by developing interactive web applications.
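Removing private and sensitive information before sharing might look like the following minimal Python sketch. The column names (`participant_id`, `name`, `email`, `score`), the sample records, and the helper `deidentify` are all hypothetical. Which fields count as sensitive depends on the study. Note that replacing an ID with a salted hash is pseudonymisation rather than full anonymisation.

```python
import csv
import hashlib
import io

# Hypothetical columns; which fields count as sensitive depends on the study.
DIRECT_IDENTIFIERS = {"name", "email"}

def deidentify(raw_text, salt):
    """Drop direct identifiers and replace the participant id with a salted hash."""
    reader = csv.DictReader(io.StringIO(raw_text))
    kept = [f for f in reader.fieldnames if f not in DIRECT_IDENTIFIERS]
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=kept, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        shared = {f: row[f] for f in kept}
        token = hashlib.sha256((salt + row["participant_id"]).encode("utf-8"))
        shared["participant_id"] = token.hexdigest()[:12]
        writer.writerow(shared)
    return out.getvalue()

RAW = """participant_id,name,email,score
P001,Alice Chan,alice@example.org,85
P002,Bob Lee,bob@example.org,92
"""

shared_csv = deidentify(RAW, salt="replace-with-a-secret-value")
print(shared_csv)
```

The salt should be kept private and stored separately from the shared file, since anyone who knows it could re-derive the hashes from known IDs.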
Tool Recommendations:
Data Repositories
There are many data repositories available online for sharing data sets; some are subject-based, material-type specific, or region specific. If you are new to this area, you may want to start with the following three platforms:
You can also develop your own data management / sharing systems using open source data platforms:
Data Visualization Software
Library Services:
The Library provides Digital Scholarship Services to help faculty members develop interactive web applications for public access. We offer a Digital Scholarship Grant as well as a non-grant application track.
There are many free and subscription-based data resources available for researchers to re-use. We will prepare another library guide for data resources. Please stay tuned.
Watch these videos on why and how HKBU researchers share data.
Watch this video on the importance of data storage, documentation, and file formats.
Useful Online Learning Resources for Data Science
DataCamp (https://www.datacamp.com/)
Coursera (https://www.coursera.org/browse/data-science)
edX (https://www.edx.org/course?subject=Data%20Analysis%20%26%20Statistics)
Codecademy (https://www.codecademy.com/)
Feel free to contact me if you have questions about Research Data Services.
Rebekah Wong
Head, Digital & Multimedia Services
rebekahw@hkbu.edu.hk
Copyright © 2010-2019. Hong Kong Baptist University Library. All rights reserved.