How Big is Big Data?
Big data is a term for data sets that are so large or complex that traditional data processing application software is inadequate to deal with them. Big data challenges include capturing data, data storage, data analysis, search, sharing, transfer, visualization, querying, updating and information privacy.
Today, about 2.5 quintillion bytes of data are generated every day, and the amount created daily is predicted to increase significantly in the years to come. This makes analysing big data a noteworthy field going forward.

Current State of Big data

Data is collected from various sources such as mobile devices, cameras, microphones, and Internet of Things sensors. To keep up with this rapid growth, the technological capacity to store data has roughly doubled every 40 months since the 1980s.
As of 2012, about 2.5 exabytes of data are generated every day, and traditional database management systems face difficulties dealing with such large amounts of data, leading to the development of new and more advanced software. The definition of big data may differ for each organisation; it depends largely on the scale at which they operate.
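The two figures above describe the same quantity in different units: 2.5 quintillion bytes is 2.5 × 10^18 bytes, which is exactly 2.5 exabytes. A quick sketch of the conversion:

```python
# 2.5 quintillion bytes expressed in exabytes (1 EB = 10**18 bytes).
quintillion = 10**18
bytes_per_day = 2.5 * quintillion

exabytes_per_day = bytes_per_day / 10**18
print(exabytes_per_day)  # 2.5
```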

Why is big data better than traditional databases?

When you talk about data in relational databases (RDBs), there are two types of data: pristine data, which is accurate, clean, and 100% reliable, and bad data, which is full of inconsistencies. A huge amount of time, money, and accountability goes into making sure the data is well prepared before loading it into the database.
Big data consists of both pristine and bad data, but the major difference between big data and an RDB is that the "big" in big data makes the bad data irrelevant: there is enough volume that the amount of bad or missing data becomes statistically insignificant. When the errors in your data are common enough to cancel each other out, when the missing data is proportionally small enough to be negligible, and when your data access requirements and algorithms are functional even with incomplete and inaccurate data, then you have "big data".
"Big data" is not really about the volume; it is about the characteristics of the data.
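The claim that errors cancel out at scale can be illustrated with a minimal simulation. Assuming the measurement errors are symmetric random noise (a hypothetical scenario, not from the original text), the average of a small sample is noisy, while the average of a large sample converges on the true value:

```python
import random

random.seed(42)
true_value = 100.0

# Each "measurement" is the true value plus symmetric random error.
# With few records the noise dominates; with many, it averages out.
for n in (10, 1_000_000):
    samples = [true_value + random.gauss(0, 25) for _ in range(n)]
    mean = sum(samples) / n
    print(f"n={n}: mean={mean:.2f}")
```

With a million samples the mean lands within a fraction of a unit of 100, even though individual errors are large: exactly the sense in which enough volume makes bad data statistically insignificant.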

Characteristics of Big Data

Big data philosophy encompasses unstructured, semi-structured, and structured data; however, the main focus is on unstructured data.
There are five V's in big data: Volume, Variety, Velocity, Variability, and Veracity.

Volume

The quantity of generated and stored data. The size of the data determines the value and potential insight, and whether it can actually be considered big data or not.

Variety

The type and nature of the data. This helps people who analyze it to effectively use the resulting insight.

Velocity

In this context, the speed at which the data is generated and processed to meet the demands and challenges that lie in the path of growth and development.

Variability

Inconsistency of the data set can hamper processes to handle and manage it.

Veracity

The quality of captured data can vary greatly, affecting the accuracy of analysis.

Processing Big Data

Techniques such as A/B testing, machine learning, and natural language processing are used to process big data.
Processing methods such as predictive analytics, user behaviour analytics, and other advanced data analytics are used to extract value from data and apply it to the performance optimization of a particular firm.
The major problem with processing big data is that such a huge amount of data also demands high processing power, which can be met by using advanced computing machines and software.
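One common answer to the processing-power problem is to avoid loading everything at once and instead process the data as a stream, keeping only a running aggregate. The sketch below is an illustrative example (not a method from the original text), using a generator to stand in for a data source too large to hold in memory:

```python
def streaming_mean(stream):
    """Compute a mean over a stream without materializing it in memory."""
    count, total = 0, 0.0
    for value in stream:
        count += 1
        total += value
    return total / count if count else 0.0

# A generator stands in for a data set too large to fit in RAM:
# one million readings are consumed one at a time.
readings = (x * 0.5 for x in range(1_000_000))
print(streaming_mean(readings))  # 249999.75
```

The same pattern (map a record, fold it into an aggregate, discard it) underlies distributed frameworks that split such work across many machines.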

Uses Of Big data

Big data is used in fields such as:
1.    Banking and Securities
2.    Communication, Media and entertainment
3.    Healthcare Provision
4.    Education
5.    Manufacturing
6.    Discovery of Natural Resources
7.    Insurance
8.    Retail and wholesale trade
9.    Transportation
10. Energy and utilities

Analysis of data sets can find new correlations and patterns to
·       Spot business trends,
·       Prevent diseases,
·       Combat crime.

Scientists, business executives, medical practitioners, advertisers, and governments alike regularly meet difficulties with large data sets in areas including:
  •        Internet search,
  •        Fintech,
  •        Urban informatics,
  •        Business informatics.

Conclusion:
  •       There is substantial real spending around big data.
  •       To capitalize on big data opportunities, you need to:
          •        Familiarize yourself with and understand industry-specific challenges
          •        Understand the data characteristics of each industry
          •        Understand where spending is occurring
          •        Match market needs with your own capabilities and solutions
  •       Vertical industry expertise is key to utilizing big data effectively and efficiently.



Author: Keyur Dhavle
