Chapter 1: What is Big Data?
This chapter first describes how we measure data and how its creation has skyrocketed in recent years. We then define Big Data and psychology for the purposes of the book, and motivate why their intersection is important to study. The chapter ends with a guide to using the book and brief summaries of the upcoming chapters.
Class survey
Here is the code for running the class survey at the beginning of class. The ZIP folder contains:
- An HTML file (bigdataregistration.html) with the client-side code (HTML and JavaScript) for the survey.
- A PHP file (bigdataregistration.php) with the server-side code (PHP, MySQL) that handles saving the survey data. You will need your own web server and database to store the students' responses. As an example, I host my website and data with IONOS, but there are many web hosts available.
- If you do not want to deal with creating a database to save the survey data, there is also a different HTML file (bigdatareg-noserver.html) that instead has each student send you a code-generated e-mail with their responses.
- I also include MATLAB code (viewmousemovements.m, which also uses registrationpage.png) to generate videos of mouse movements from the students' data. MATLAB is not the ideal language for this, but it works! (Implementing this in Python is on my to-do list.)

How much data do we make each day?
Here's a 2025 blog post on Exploding Topics with statistics on how much data is generated each day.
Here's an older Forbes article (with a great visualization) on how much data is collected every minute.

How is data measured?
Here's an easy-to-understand primer on how the binary number system works.
Here's a page that can convert between decimal and binary numbers.

What was it like moving the world's first hard drive?
Here is a video about the world's first hard drive, from IBM in 1956:
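
As a companion to the binary primer and the decimal/binary converter listed under "How is data measured?", here is a minimal Python sketch of the two conversions done by hand. It is not code from the book; the function names `to_binary` and `from_binary` are made up for illustration, and the sketch assumes non-negative whole numbers only.

```python
def to_binary(n: int) -> str:
    """Convert a non-negative integer to binary digits by repeated division by 2."""
    if n < 0:
        raise ValueError("this sketch only handles non-negative integers")
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        digits.append(str(n % 2))  # the remainder is the next bit, least significant first
        n //= 2
    return "".join(reversed(digits))  # flip so the most significant bit comes first


def from_binary(bits: str) -> int:
    """Convert a string of 0s and 1s back to a decimal integer."""
    value = 0
    for bit in bits:
        if bit not in ("0", "1"):
            raise ValueError(f"not a binary digit: {bit!r}")
        value = value * 2 + int(bit)  # shift left one place, then add the new bit
    return value


if __name__ == "__main__":
    for n in (0, 5, 255, 1956):
        bits = to_binary(n)
        print(n, "->", bits, "->", from_binary(bits))
```

Python's built-in `bin(n)` and `int(bits, 2)` perform the same conversions, so they make a convenient check on the hand-written versions.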