Data: why is it so important nowadays?

Wellington Martins dos Santos
3 min read · Sep 5, 2020

The main ideas I consider important from the book Data Science for Business: What You Need to Know about Data Mining and Data-Analytic Thinking.

Everybody produces a massive amount of data every day, and big companies such as Google, Amazon, Instagram and Facebook use it to provide the best customer experience. For example, based on what you have accessed, they suggest products to you, or deliver your content to the right customers and potential buyers. With that in mind, organizing and understanding this huge amount of data is the duty of data scientists. These professionals basically look for patterns and relationships in order to understand a great deal about us without ever knowing us in real life (crazy, isn’t it?), and then decide what the next step is or, better still, improve decision-making in a business context.

As you can see, in the end the main focus is the decision or, rather, a better decision. There are mainly two types of decisions. The first type is deciding which discoveries need to be made within the data. For example, it is known that customers have buying patterns that change only when important events happen in their lives, such as the birth of a couple's first baby. The second type covers decisions that are repeated many times, where an improvement leads to better outcomes. If the pattern of customers defecting from their contracts is understood, action can be taken before it happens, and effort can be focused on the profitable clients.

For these decision-making processes to work, huge amounts of data are needed, and this is where the concept of Big Data comes in: data sets that are too large for traditional data processing systems.

When applying the algorithms and methods usually found in the data science world, there are several main concepts and ideas that are important to keep in mind.

1st: One of the main problems data scientists face is known as overfitting. As the book puts it, ‘if you look too hard at a set of data, you will find something’. What you find, however, may not hold for data you have not seen yet.

2nd: Context is another important factor that needs to be considered throughout the process.

3rd: Among the sea of possible algorithms and methodologies, there are four main ones: classification, regression, similarity matching and clustering. Classification is about predicting which of a set of classes each individual belongs to. For example, deciding which clients are the best targets for an advertisement, considering different age ranges. With regression the idea is different: the model estimates the numerical value of some variable for each individual. For example, how much money people will have at the end of the year if they invest in technology companies, or perhaps how strong individuals will be after working out 3 times a week for 10 weeks, or how much a given customer will use a service. Regression and classification are related but different: classification predicts whether something will happen, and regression predicts how much of it will happen. Similarity matching is about identifying similar individuals or entities based on many of their characteristics. It is the basis of one of the most common methods for making product recommendations. Clustering is commonly used in the exploratory phase to group individuals by their similarity, without being driven by any specific purpose. It helps with questions like ‘What products should we offer or develop?’. All of these methods are discussed in depth in the book's later chapters.
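To make the ideas above a bit more concrete, here is a minimal Python sketch. It is not taken from the book: the customers, their numbers and the function names are all made up for illustration, and a simple nearest-neighbor approach is just one of many possible methods. It uses similarity to find the closest known customers, then answers a "whether" question (classification) and a "how much" question (regression) from those neighbors. It also hints at overfitting: with k=1 the model simply memorizes the training data, including two labels that were made noisy on purpose. Clustering is left out to keep the example short.

```python
import math
from collections import Counter

# Hypothetical customers: (name, age, visits_per_month, responded_to_ad, usage_hours).
# "davi" and "hugo" carry deliberately noisy labels.
TRAIN = [
    ("ana",   22, 12, 1, 30.0),
    ("bruno", 25, 10, 1, 26.0),
    ("carla", 27, 11, 1, 27.0),
    ("davi",  24, 13, 0, 28.0),
    ("elisa", 52,  3, 0,  7.0),
    ("fabio", 55,  1, 0,  3.0),
    ("gabi",  48,  2, 0,  5.0),
    ("hugo",  50,  4, 1,  6.0),
]
# Held-out customers the model never saw: (age, visits_per_month, responded_to_ad).
HOLDOUT = [(23, 13, 1), (26, 11, 1), (51, 4, 0), (54, 2, 0)]


def most_similar(age, visits, k):
    """Similarity: the k training customers closest in (age, visits) space.

    Features are used unscaled here to keep the sketch short; real data
    would normally be rescaled first.
    """
    return sorted(TRAIN, key=lambda c: math.dist((age, visits), (c[1], c[2])))[:k]


def classify(age, visits, k):
    """Classification: WHETHER the customer responds (majority vote of neighbors)."""
    votes = Counter(c[3] for c in most_similar(age, visits, k))
    return votes.most_common(1)[0][0]


def regress(age, visits, k):
    """Regression: HOW MUCH the customer uses the service (mean of neighbors)."""
    neighbors = most_similar(age, visits, k)
    return sum(c[4] for c in neighbors) / len(neighbors)


def accuracy(rows, k):
    """Share of (age, visits, label) rows that classify() gets right."""
    return sum(classify(a, v, k) == y for a, v, y in rows) / len(rows)


train_rows = [(age, visits, label) for _, age, visits, label, _ in TRAIN]

# A new 24-year-old customer who visits 11 times a month.
print("similar customers:", [c[0] for c in most_similar(24, 11, k=3)])
print("will respond?", classify(24, 11, k=3))
print("expected usage hours:", regress(24, 11, k=3))

# Overfitting check: compare accuracy on seen vs. unseen customers.
for k in (1, 3):
    print(f"k={k} train={accuracy(train_rows, k)} holdout={accuracy(HOLDOUT, k)}")
```

On this toy data, k=1 is perfect on the customers it has already seen (1.0) but gets only half of the held-out customers right (0.5), while k=3 makes two mistakes on the training data (0.75) and gets all held-out customers right (1.0). That is the overfitting warning in miniature: looking too hard at one data set finds "patterns" that are really just noise.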
