Data science helps companies know us better than ever before

Mar 11, 2021


Have you ever felt that some companies know your secrets better than your parents and close friends do? In the big data era, I have that suspicion from time to time. I remember that once, during a serious conflict with my parents, the media outlet I subscribed to kept emailing me recommended articles on how to communicate with parents. Evidently, it had captured my browsing behavior, and its algorithms had accurately targeted my needs. Some of the articles the algorithms chose did help me reach an agreement with my parents in the end, and I have stayed subscribed to the outlet for years. I think the key to its retaining me is that it successfully leveraged its data to win my emotional attachment.

From my point of view, the most important part of data-driven decision-making in business is understanding your customers’ demands and creating an emotional attachment with them. In the past quarter, I worked as a data scientist with one of the biggest self-storage companies in the world for my practicum, and had a chance to learn why the company values data so much and how it applies data science.

Self-storage is usually thought of as a traditional business sector, somewhat like real estate. That is true in a sense, because the industry’s growth strategy depends heavily on acquiring land and properties and on keeping their occupancy rates high. When people talk about data science applications, they usually think of category-defining Internet companies like Amazon, Google, Netflix, LinkedIn, or Facebook, all of which have built a data-oriented culture and used data and analytics to establish competitive differentiation with every iteration of their business models. However, as more and more traditional-sector companies develop e-commerce platforms, they gain the ability to use the large volume of data generated online each day as a critical differentiator, in the sense of Donald Hambrick and James Fredrickson’s Strategy Diamond, to enhance customer engagement and add customer value.

Source: Adapted from Donald C. Hambrick and James W. Fredrickson, “Are You Sure You Have a Strategy?,” Academy of Management Executive 19, no. 4 (2005): 51–62.

Imagine you are moving to another city today, and you have so many belongings that you need to store some of them temporarily before moving them into your new home. What will you do? You will probably go to Google or Bing and search “self-storage around me”, then open the homepage of one of the self-storage companies from the Google or Bing map results, enter your preferences into the filter, browse different properties, find the one that suits you best, enter your contact information to create an account with the company, make a reservation, and complete the payment. Your behavior throughout this process can be captured as a set of parameters, including your unique visitor ID, the marketing channel you came from, the time spent and number of hits on each page, the page where you bounced, the device you used to make the reservation, the transaction revenue associated with your order, and so on. With those parameters, we can run machine learning algorithms to segment customers into clusters based on how similar they are. That’s one of the applications of machine learning in the modern self-storage industry. Once we have the customer segmentation, we can use more targeted marketing strategies for each segment.
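As a rough illustration of the segmentation step, here is a minimal sketch in Python. It assumes k-means as the clustering algorithm, which is only one of several options and not necessarily what the company uses. The three features and all visitor values are made up for the example.

```python
import random
from statistics import mean

# Hypothetical web-session features per visitor (illustrative values only):
# [minutes on site, page hits, transaction revenue in dollars]
visitors = [
    [2.0, 3.0, 0.0],
    [1.5, 2.0, 0.0],
    [3.0, 4.0, 0.0],
    [12.0, 15.0, 120.0],
    [11.0, 14.0, 110.0],
    [13.0, 16.0, 130.0],
]

def standardize(rows):
    """Scale each column to zero mean and unit variance so no feature dominates."""
    cols = list(zip(*rows))
    means = [mean(c) for c in cols]
    stds = [
        (sum((x - m) ** 2 for x in c) / len(c)) ** 0.5 or 1.0
        for c, m in zip(cols, means)
    ]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iterations=50, seed=0):
    """Plain k-means: assign each point to its nearest centroid, then recompute centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k randomly chosen visitors
    labels = [0] * len(points)
    for _ in range(iterations):
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            # Keep the old centroid if a cluster ends up empty.
            new_centroids.append(
                [mean(col) for col in zip(*members)] if members else centroids[c]
            )
        if new_centroids == centroids:  # assignments have stopped changing
            break
        centroids = new_centroids
    return labels, centroids

labels, centroids = kmeans(standardize(visitors), k=2)
for row, label in zip(visitors, labels):
    print(f"segment {label}: {row}")
```

On this toy data the visitors who only browsed end up in one segment and those who made a reservation in the other. In practice the features would come from the web analytics parameters listed above, and the number of segments would need to be chosen and validated rather than fixed at two.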

In addition, we can use web data and third-party data to predict customer demand. For example, if we know that a customer is a university student who will graduate in September this year, chances are that he or she will need a self-storage unit around that time, and we could offer a promotion to encourage a reservation with us. We could also use state-to-state migration flow data to build a model that estimates the number of potential customers who need our services when moving house. The basic logic behind such prediction models is that big life events trigger demand for self-storage (the classic triggers are the “4 Ds”: death, disaster, debt, and divorce). That being said, the real driver of customer demand is not the life events themselves but customers’ emotional attachments, as the VP of Marketing at the self-storage company I work with put it. Successful marketing at self-storage companies should aim to create the feeling that “That’s my cherished stuff, I may never get this again, so I want to keep it”. We can leverage data science to create that feeling through targeted marketing that makes customers want to keep things in our properties longer and more often rather than simply throwing them away. That helps increase customer lifetime value, an extremely important performance indicator in nearly every industry.

While we should recognize the strengths of data science in the industry, there are also risks. In their book Ethics of Big Data, Kord Davis and Doug Patterson raise several elements of big-data ethics, including identity, privacy, and ownership, that companies need to be aware of and cautious about.


Companies need to determine which actions are appropriate and which are inappropriate when obtaining or building customers’ identities, because although big data makes it easy to summarize, aggregate, or correlate various aspects of a customer’s identity, all of this can be done without the customer’s agreement.


In this era we have to wonder whether we have lost or gained control over how the world perceives us, as the authors point out in the book. Research by Latanya Sweeney of Carnegie Mellon University found that 87% of the United States population (216 million of 248 million) had reported characteristics that likely made them unique based only on {5-digit ZIP, gender, date of birth}. That is to say, big data has quietly shifted the decision about how much privacy to expose from customers to the companies that own the data. It is therefore companies’ responsibility to honor and respect customers’ privacy when gathering data.
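To make the idea of uniqueness concrete, here is a small sketch in Python of how one might measure it on a dataset: count how many records share each {5-digit ZIP, gender, date of birth} combination and report the share that appear exactly once. The records are invented for illustration, and this is only the counting idea, not a reproduction of Sweeney’s methodology.

```python
from collections import Counter

# Hypothetical customer records: (5-digit ZIP, gender, date of birth)
records = [
    ("15213", "F", "1990-04-12"),
    ("15213", "F", "1990-04-12"),
    ("15213", "M", "1985-11-02"),
    ("94110", "F", "1978-06-30"),
    ("94110", "M", "1992-01-15"),
]

def unique_share(rows):
    """Fraction of records whose attribute combination appears exactly once."""
    counts = Counter(rows)
    unique = sum(1 for row in rows if counts[row] == 1)
    return unique / len(rows)

share = unique_share(records)
print(f"{share:.0%} of records are unique on ZIP, gender, and date of birth")
# prints: 60% of records are unique on ZIP, gender, and date of birth
```

A company could run a check like this before sharing or joining datasets: the higher the share, the easier it is to re-identify individuals even after names have been removed.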



The World Economic Forum (WEF) describes personal data as a “new economic asset class” (data as currency), and companies are generating more and more revenue from the use of personal data. So it is prudent to think through the ownership issues before we use the data for business purposes.


When data science helps companies know us better, we are provided with better services as customers; as individuals, however, we may already have paid a price in our identity, our privacy, and the ownership of our personal data. Still, I believe there can be a win-win relationship between customers and companies once we balance the power and ethics of data science.



