Big data is a term for data sets that are so large or complex that traditional data processing application software is inadequate to deal with them. Big data challenges include capturing data, data storage, data analysis, search, sharing, transfer, visualization, querying, updating and information privacy.
In today's world, about 2.5 quintillion bytes of data are generated every day, and the amount of data created daily is predicted to increase significantly in the times to come. This makes analysing big data a noteworthy field for the future.
Current State of Big data
The data is collected from various sources like mobiles, cameras, microphones and the Internet of Things. To keep up with the rapid growth of data there has been a matching technological development: the technological capacity to store data has doubled roughly every 40 months since the 1980s.
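As a rough back-of-the-envelope sketch of what that doubling claim implies (the 40-month period comes from the paragraph above; the starting capacity of 1 unit is purely illustrative, not a real figure):

# Rough sketch of the "capacity doubles every 40 months" claim.
# The starting value (1 unit of capacity in 1980) is an illustrative placeholder.

def capacity_after(months, doubling_period=40, start=1.0):
    """Capacity after `months` months, assuming it doubles every `doubling_period` months."""
    return start * 2 ** (months / doubling_period)

# Example: 1980 to 2012 is 32 years, i.e. 384 months.
print(capacity_after(384))  # 2 ** 9.6, roughly a 776-fold increase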
As of 2012, about 2.5 exabytes of data were generated every day through various sources, and traditional database management systems face difficulties dealing with such large amounts of data, which has led to new and more advanced software being developed. The definition of big data might be different for each organisation; it largely depends on the scale at which they work.
Why is big data better than other data?
When you talk about data in RDBs (relational databases), there are two types of data: pristine data (the best data), which is accurate, clean and 100% reliable, and bad data, which has a lot of inconsistencies. A huge amount of time, money and accountability goes into making sure the data is well prepared before loading it into the database.
Big data consists of both pristine and bad data, but the major difference between big data and an RDB is that the "big" in big data makes the bad data irrelevant: there is enough volume that the amount of bad or missing data becomes statistically insignificant. When the errors in your data are common enough to cancel each other out, when the missing data is proportionally small enough to be negligible, and when your data access requirements and algorithms work even with incomplete and inaccurate data, then you have "big data".
"Big Data" is not really about the
volume, it is about the characteristics of the data.
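A minimal simulation (not from the article; it assumes symmetric measurement errors and a small fraction of dropped records, both made-up parameters) can make this point concrete: as the number of records grows, the estimate of the true value stabilises even though individual records are noisy or missing.

import random

random.seed(0)

def noisy_mean(n, true_value=100.0, error_scale=20.0, missing_rate=0.02):
    """Average of n measurements where each carries a symmetric random error
    and a small fraction of records is missing entirely."""
    values = []
    for _ in range(n):
        if random.random() < missing_rate:
            continue  # missing record, simply dropped
        values.append(true_value + random.uniform(-error_scale, error_scale))
    return sum(values) / len(values)

# The estimate drifts toward the true value (100.0) as the volume grows.
for n in (100, 10_000, 1_000_000):
    print(n, round(noisy_mean(n), 3))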
Characteristics of Big Data
The big data philosophy encompasses unstructured, semi-structured and structured data; however, the main focus is on unstructured data.
There are 5 V’s in Big data
viz. Volume, Variety, Velocity, Variability, Veracity.
Volume
The quantity of generated and stored data. The size of the data determines the value and potential insight, and whether it can actually be considered big data or not.
Variety
The type and nature of the data. This helps people who analyze it to effectively use the resulting insight.
Velocity
In this context, the speed at which the data is generated and processed to meet the demands and challenges that lie in the path of growth and development.
Variability
Inconsistency of the data set can hamper processes to handle and manage it.
Veracity
The quality of captured data can vary greatly, affecting the accuracy of the analysis.
Processing Big Data
Technologies like A/B testing, machine learning and natural language processing are used to process big data.
Processing methods like predictive analytics, user behaviour analytics and other advanced data analytics methods are used to extract value from data and use it for the performance optimisation of a particular firm.
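As an illustrative sketch of a predictive-analytics step (the data here is synthetic and the relationship between the two variables is invented purely for the example), a simple linear model can be fitted to behavioural data and used to make a prediction:

import numpy as np

rng = np.random.default_rng(42)

# Synthetic "user behaviour" data: daily active minutes vs. purchases.
# The numbers are made up; they only illustrate the shape of the workflow.
active_minutes = rng.uniform(5, 120, size=1_000)
purchases = 0.05 * active_minutes + rng.normal(0, 1.0, size=1_000)

# Fit a simple linear model as a stand-in for "predictive analytics".
slope, intercept = np.polyfit(active_minutes, purchases, deg=1)

# Predict purchases for a new user who is active 60 minutes a day.
print(f"predicted purchases: {slope * 60 + intercept:.2f}")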
The only major problem with processing big data is that, for such a huge amount of data, the processing power required is also high; this can be met by using advanced computing machines and software.
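One common way to meet that requirement is to split the work across many processors or machines. The sketch below uses Python's standard multiprocessing module for a toy MapReduce-style word count; the corpus is a made-up placeholder, and in a real system the chunks would come from distributed storage rather than an in-memory list.

from collections import Counter
from multiprocessing import Pool

def count_words(chunk):
    """Map step: count the words in one chunk of text."""
    return Counter(chunk.split())

if __name__ == "__main__":
    # Toy stand-in for a large corpus that has been split into chunks.
    chunks = ["big data is big", "data is everywhere", "big data needs processing"] * 1000

    with Pool() as pool:
        # Work is spread across the available CPU cores.
        partial_counts = pool.map(count_words, chunks)

    # Reduce step: merge the per-chunk counts into one total.
    total = Counter()
    for c in partial_counts:
        total.update(c)

    print(total.most_common(3))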
Uses of Big Data
Big data is used in fields such as:
1. Banking and Securities
2. Communication, Media and Entertainment
3. Healthcare Provision
4. Education
5. Manufacturing
6. Discovery of Natural Resources
7. Insurance
8. Retail and wholesale trade
9. Transportation
10. Energy and utilities
Analysis of data sets can find new correlations and patterns to:
· Spot business trends,
· Prevent diseases,
· Combat crime.
Scientists,
business executives, practitioners of medicine, advertising and governments
alike regularly meet difficulties with large data-sets in areas including:
- Internet search,
- Fintech,
- Urban informatics,
- Business informatics.
Conclusion:
- There is substantial real spending around big data
- To capitalize on big data opportunities, you need to:
- Familiarize yourself with and understand industry-specific challenges
- Understand the data characteristics of each industry
- Understand where spending is occurring
- Match market needs with your own capabilities and solutions
- Vertical industry expertise is key to utilizing big data effectively and efficiently.
Author: Keyur Dhavle