5 things to know about Data in the real world:




1. It is messy and dirty. Real-world data often has issues such as missing values, incorrectly entered values, and data type mismatches.
2. It is disorganized. If you're lucky, you get your data directly from a single source, but many companies keep data in multiple storage systems.
3. Excel sheets and PDFs contain far more important data than you might imagine. Parsing them and storing the contents in machine-readable form is a task in itself.
4. Collecting data at the source is a task as well. Many mechanisms have to be put in place to gather data accurately.
5. In research, datasets often come as small databases or CSV files. In industry, the right infrastructure has to be in place to store data in a way that enables easy retrieval.

What do you think? #data #datascience
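To make the first point concrete, here is a minimal sketch of what handling messy input can look like, using only Python's standard library. The sample rows, the column names, and the helpers `parse_age` and `clean_rows` are made up for illustration. Real pipelines usually need far more rules than this.

```python
import csv
import io

# Hypothetical raw export: a missing age, a non-numeric age,
# stray whitespace around a name, and a missing city.
RAW_CSV = """name,age,city
Alice,34,Paris
Bob,,Berlin
 Carol ,thirty,Madrid
Dave,41,
"""


def parse_age(value):
    """Return the age as an int, or None if it is missing or not a number."""
    value = value.strip()
    if not value:
        return None
    try:
        return int(value)
    except ValueError:
        return None


def clean_rows(text):
    """Trim whitespace, coerce the age column, and map empty cities to None."""
    rows = []
    for record in csv.DictReader(io.StringIO(text)):
        rows.append({
            "name": record["name"].strip(),
            "age": parse_age(record["age"]),
            "city": record["city"].strip() or None,
        })
    return rows


cleaned = clean_rows(RAW_CSV)

# Names of the records whose age could not be recovered.
missing_age = [row["name"] for row in cleaned if row["age"] is None]
```

Marking bad values as `None` instead of dropping the rows is one possible choice. It keeps the record count intact, so you can decide later whether to impute, fix at the source, or discard.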

Comments

  1. Over the last few years, the world has seen a meteoric rise in the amount of data we capture about our lives. From simple social media posts to the thousands of hours of video uploaded to YouTube, billions of people around the world now share an unprecedented amount of data about who we are, what we do, and what we think. This explosion of data technologies is now being used by businesses and governments alike to make better decisions, and it has driven the rise of a new industry focused on turning Big Data into actionable insights. As a result, there are now more ways to store, process, and analyze data than ever before.

