A Big Data System (BDS) represents a digital transformation from traditional structured data processing (RDBMS) to a Hadoop environment, with Natural Language Processing (NLP) for unstructured data. It facilitates better and faster decision making. Big Data has arrived like a tidal wave and continues to move swiftly across most industries and organizations globally. Concurrently, a number of supporting technologies have emerged that make it possible to store and quickly process vast amounts of data flowing from anywhere to anywhere, at any time, in a variety of forms and formats. BDS helps by quickly extracting relevant information for better decision-making. Big Data is one of the fastest-emerging technologies, with great potential to impact our working culture, business processes and practices, and business strategies.
Of late, social media has opened the floodgates to data exchange, letting people interact across the world on a 24x7 basis. There are no geographical boundaries, time zones, or fixed formats or types of data. Earlier, business houses and enterprises dealt with large data using their own leased lines and data centers. These configurations cannot cope with today’s data deluge, which includes structured data (RDBMS), semi-structured data (key-value pairs or documents, as in MongoDB) and unstructured data (text files, video clips, photographs, comments and likes, emails, SMS messages, and posts on social media such as Facebook, Instagram, Telegram and WhatsApp). Most social media data is unstructured and non-relational, requiring a high-tech workforce to pre-process it and make it available to the end-user in the required format. Fortunately, with easy and speedy connectivity available through the internet and 5G mobile communication networks, one can organize BD using Cloud Computing. To meet such demands, software tools and languages like Hadoop, MapReduce, Pig, Hive, Ruby and Python are essential for working in the Big Data environment.
Google, Oracle, IBM, Microsoft, Rackspace and Amazon already use and offer cloud services to handle BD. These IT world leaders provide relevant information to their customers for marketing, sales, CRM, SCM and/or political campaigns. It is estimated that India will have a 32% share of the Big Data global market by 2025. Despite initial apprehensions about data ownership, security and privacy, Cloud Computing and Big Data are the new growth engines for any business.
Rapid Growth of Big Data. Due to the easy availability of Cloud Computing, Machine Learning (ML) and smarter sensors, BD continues to grow very rapidly. After initial apprehensions, a large number of industries and organizations are now adopting Big Data technology. Some main factors for the rapid growth of BD are:
· Availability of Cloud Computing, with elasticity of resources and assured security and integrity of data.
· ML with improved algorithms for information processing and decision making.
· Advances in NLP for handling multilingual text and voice messages.
· Availability of smarter and more compact data sensors and actuators.
· IoT providing real-time or near-real-time data from multiple locations.
· Better network connectivity, through improved Internet services and the arrival of 5G mobile communication networks.
· Demand from healthcare services for better patient management, utilization of resources and research for disease control, as in the COVID-19 pandemic.
Characteristics of Big Data. A traditional database, even one of a few terabytes, is not considered Big Data, since it can be handled using a traditional RDBMS like Oracle, MySQL or IBM DB2. In large enterprises, the concepts of data warehouses and data mining have been in vogue for over 15 years. These organizations have distributed computing environments using their own leased lines, or a private cloud using internet facilities. Big Data is a new approach to tackling problems related to unstructured and multi-format data, which are difficult to solve with traditional RDBMS tools. There are four distinct characteristics of Big Data, popularly called the four Vs (V4). These are briefly described below:
· Volume: This relates to very high volumes of data, in the range of terabytes and even petabytes. The data is huge and keeps growing continuously at great speed, round the clock (24x7), throughout the year.
· Variety: Data is organized in multiple structures, ranging from raw (unstructured) data through semi-structured data to structured data (stored in rows and columns). To make things even more complex, data can be text, an email with or without attachments, an SMS, a WhatsApp or Telegram message, a Tweet, an Instagram post, a photo, or a video or sound clip, in any language or format.
· Velocity: Data from registered customers, clients and business partners usually arrives through normal channels, in expected formats, over leased data lines or the internet. However, data from social media networks, feedback from online trading and reports from front-line press reporters can arrive at any time, and simultaneously from many locations around the globe.
· Veracity: This relates to the accuracy and trustworthiness of data, which forms the basis for timely and accurate decision making.
Sources of Big Data. There are basically three categories of data:
· Internal Data. The organization generates its own data and has full control over it. This data includes the corporate database, internal documents (SOPs, policies, instructions), in-house call center data, website logs, and data coming from sensors and controllers deployed at various locations.
· External Data. This is public data, or data generated outside the organization. As such, the organization neither owns nor controls it. It includes social media data (such as Facebook and Telegram), statistical data, public domain data and Machine Learning data.
· Environment Data. This relates to weather data, soil data, water sources data, road/sea/air data and healthcare data.
Need for Big Data. Accurate and timely information is the key to success in any business. Earlier, the DBMS environment consisted of an RDBMS (Oracle, DB2, MS SQL, MySQL) as the back end, a data schema and a web server. This traditional RDBMS approach is ill-suited to present-day Big Data. Today, we deal with thousands of gigabytes of data in various formats, arriving as LinkedIn posts, blog posts, Tweets, Facebook posts, and Instagram and Telegram messages. Social media interactions in the form of text, video clips and photos generate huge data traffic on a 24x7/365 basis. Some data is structured and stored in the traditional RDBMS way, while other data, including documents related to customers, service records, and even pictures and videos, is stored in unstructured forms. In addition, large volumes of data are generated by machines and by sensors deployed inside machinery, on the ground and in aerial vehicles like aircraft, drones and satellites. Other external information sources are human-generated on social media. Such diverse data flows across various industries, institutions, healthcare centers, Research and Development (R&D) labs and Government Departments. Therefore, you have to think about managing data differently and apply Big Data technology.
BD Cloud Computing Environment. BDS is heavily dependent on Cloud Computing and IoT environments, where data centre nodes continuously log messages and track various transactions. It is important to gather real-time or near-real-time data and store it on multi-server clusters, but it is still more important to extract hidden information and present a comprehensive report to the customer or user in a format they can easily comprehend and use to make better decisions. This huge volume of data requires parallel processing and a special approach to storing data on multiple cluster computers (nodes). In addition, a Big Data solution needs automatic scalability and recovery. To cope with the ever-growing data volume, the system should allocate more nodes and redistribute the data among them automatically and seamlessly. All this is now possible thanks to the easy availability of cloud computing environments.
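The automatic redistribution of data across nodes can be illustrated with consistent hashing, one common technique for spreading records over a cluster so that adding a node moves only a fraction of the data. The sketch below is a minimal illustration in Python, not the mechanism of any particular Big Data product. The class name HashRing, the node names and the replicas setting are all assumptions made for this example.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a stable integer position on the ring."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Toy consistent-hash ring with virtual nodes (illustrative only)."""

    def __init__(self, nodes=(), replicas=100):
        self.replicas = replicas   # virtual nodes per physical node
        self._positions = []       # sorted ring positions
        self._owner = {}           # ring position -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        # Each node is placed at many positions to even out the load.
        for i in range(self.replicas):
            pos = _hash(f"{node}#{i}")
            bisect.insort(self._positions, pos)
            self._owner[pos] = node

    def node_for(self, key: str) -> str:
        if not self._positions:
            raise ValueError("ring has no nodes")
        # A key belongs to the first ring position after its own hash,
        # wrapping around to the start of the ring.
        idx = bisect.bisect(self._positions, _hash(key)) % len(self._positions)
        return self._owner[self._positions[idx]]


def moved_keys(keys, before: HashRing, after: HashRing):
    """Return the keys whose owning node differs between two rings."""
    return [k for k in keys if before.node_for(k) != after.node_for(k)]


# Hypothetical cluster growing from three nodes to four.
keys = [f"record-{i}" for i in range(1000)]
three_nodes = HashRing(["node-a", "node-b", "node-c"])
four_nodes = HashRing(["node-a", "node-b", "node-c", "node-d"])
moved = moved_keys(keys, three_nodes, four_nodes)
print(f"{len(moved)} of {len(keys)} records move to the new node")
```

With this scheme, the only records that change owner are those now assigned to the newly added node; the rest stay where they were, which is what makes scaling out appear seamless.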
Big Data Architecture. Big Data represents a “log of records” where each record describes some event, like a purchase in a store, a customer visit to a retail store, a web page viewed by an online buyer, a sensor reading at a given moment, customer online feedback or a short message (like a comment) on a social network. An architecture like that of an RDBMS cannot handle Big Data complexities, bug fixing and time-bound, tedious operations. As data volume and complexity increase and customers expect faster response times, you need a different architecture for data storage, manipulation, and the display of timely and accurate information. Big Data systems should efficiently handle data volume, complexity and scalability. A number of Big Data architectures were developed and deployed by Google, IBM and Microsoft. Amazon, which started a bit late, caught up fast and created Dynamo, a new distributed key-value data store. The open-source community moved fast and gave a boost to Big Data by evolving new architectures and software tools like Hadoop, Hive, Pig, HBase, Apache Sqoop and MongoDB. As of 2022, a number of architectures are in use for Big Data. One simpler approach is the Lambda Architecture, which has become popular as it avoids the complexities of traditional architectures and is easy to get started with.
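The Lambda Architecture is usually described as three layers: a batch layer that recomputes views from the complete log of records, a speed layer that covers events that arrived since the last batch run, and a serving layer that merges the two at query time. The following is a minimal sketch of that idea in Python, assuming page-view events; the class name LambdaPageViews and the event fields are hypothetical, and a real deployment would run the batch layer asynchronously on a cluster rather than in-process.

```python
from collections import Counter


class LambdaPageViews:
    """Toy Lambda Architecture for counting page views (illustrative only)."""

    def __init__(self):
        self.master_log = []          # append-only log of all events
        self.batch_view = Counter()   # recomputed from scratch by the batch layer
        self.speed_view = Counter()   # incremental counts since the last batch run

    def ingest(self, event: dict) -> None:
        # Every event goes to the master log AND to the speed layer.
        self.master_log.append(event)
        self.speed_view[event["page"]] += 1

    def run_batch(self) -> None:
        # Batch layer: rebuild the view from the full log, then reset the
        # speed layer, because the batch view now covers those events.
        self.batch_view = Counter(e["page"] for e in self.master_log)
        self.speed_view.clear()

    def query(self, page: str) -> int:
        # Serving layer: merge the batch view with the real-time view.
        return self.batch_view[page] + self.speed_view[page]


views = LambdaPageViews()
views.ingest({"page": "/home"})
views.ingest({"page": "/cart"})
views.run_batch()                     # batch view now holds both events
views.ingest({"page": "/home"})       # arrives after the batch run
print(views.query("/home"))           # batch count plus speed-layer count
```

Because the batch view is always rebuilt from the raw log, a bug in the processing logic can be corrected by fixing the code and re-running the batch, which is one reason the design is considered simple to operate.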
Some Desirable features of Big Data architecture are:
• Simple Design. As we know, a complex
system design is more likely to develop faults and harder to debug and
maintain. To overcome complexity, the design of algorithms and modules should
be simple.
• Ad-hoc Queries. The database should
be able to handle ad-hoc queries easily and efficiently.
• Extensibility. It should be easy to
incorporate changes when a customer makes a change in their business
process/rules. Thus, the system should be able to accommodate additional
functionality with minimal development effort and cost.
• Quick response. In traditional systems, the Software Development Agency (SDA) carries out load tests and stress tests, using software testing tools like Mercury LoadRunner, to ensure acceptable response time during peak load. The SDA fine-tunes hardware devices and optimizes the software design to meet customer requirements. A BD system should be very fast while reading, updating and retrieving information for display on the screen. Customers become impatient if the response is slow. Most applications require a response within a few milliseconds.
• Resilience. Big Data systems must be fault-tolerant, continuing to perform reliably and efficiently even when some servers in some clusters go down. The system should also tolerate human error. Its recovery mechanism should be efficient enough that the end-user does not feel any disruption.
• Consistency of Data. RDBMS-based distributed databases have issues related to the consistency of data, duplication of data, concurrent access to data and maintaining backup data at multiple locations. Big Data systems must be robust enough to avoid such limitations.
• Scalability. Scalability is the
ability to maintain consistent performance whenever there is a sudden surge or
drop in incoming data.
• Wider Applicability. Big
Data systems should support a wide range of applications in sectors like
Financial, Insurance, Healthcare, Banking, e-Commerce, social media Analytics
and Scientific applications.
• Minimal maintenance. The Big Data
systems should be able to carry out any scheduled maintenance without any
slowdown in response time or any inconvenience to the customers.
Software Tools. Collecting huge amounts of unstructured data from various sources, in various formats and languages, helps the end user only when the system can quickly aggregate and process the data and display meaningful information. This task is carried out by software programmers, who apply the Hadoop framework and the necessary software tools to drill down and extract the data relevant to the user's decision making. Some important software tools are briefly described below:
• Hadoop. Hadoop is an open-source framework written in Java, and the most popular framework for working in a Big Data environment. It is used for distributed storage of very large data and is capable of parallel data processing. Hadoop breaks large data into smaller blocks that are processed separately on different data nodes (servers). It then uses MapReduce to collect the partial results from the multiple nodes and compile them into a single output.
• MapReduce. It is a framework to process
large unstructured data sets in a distributed manner by using a large number of
nodes.
• HDFS. The Hadoop Distributed File System (HDFS) allows multiple files to be stored simultaneously at multiple locations and processed concurrently. Customers need not worry about the location of their files, as those could be stored on any server, in any cluster, anywhere in the world. To the end-user, these data files appear to be in one location.
• Pig. Pig is a scripting platform (its language is called Pig Latin) that sits on top of the Hadoop framework for processing large data sets. It is an open-source, higher-level alternative to writing MapReduce programs by hand.
• Hive. Like Pig, Hive is a component that sits on top of the Hadoop framework for processing large data sets. Hive translates SQL-like queries into MapReduce code, so the user need not write any code in Java or Python. Like Pig, it is an open-source, higher-level alternative to hand-written MapReduce. However, certain jobs can be executed more effectively using Hadoop MapReduce directly rather than Pig or Hive scripts.
• Apache Spark. It is a
framework used for in-memory parallel data processing, which makes near
real-time analytics possible.
• HBase. It is a NoSQL
database that allows for the high-performance processing of information at
a massive scale.
• MongoDB. It is an open-source database capable of handling both structured and unstructured data. MongoDB is quite popular for storing, processing and analyzing Big Data.
• Sqoop. This is an Apache software tool used for two-way data exchange between an RDBMS and Hadoop stores such as HDFS, Hive and HBase.
• Impala. Impala is a very handy
software tool that uses SQL queries to access data directly from HDFS.
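To make the MapReduce idea from the list above concrete, here is a minimal single-machine sketch of the classic word-count job in plain Python. It only imitates the three phases (map, shuffle, reduce) and does not use the Hadoop API; the function names are chosen for this illustration. On a real cluster, the map and reduce calls would run in parallel on different data nodes.

```python
from collections import defaultdict


def map_phase(document: str):
    """Map: emit a (word, 1) pair for every word in one input split."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key, as the framework would."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: list):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle and reduce over a list of documents."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


# Each string stands in for one block of a large file.
blocks = ["big data needs big clusters", "data flows at great speed"]
print(word_count(blocks))
```

Tools like Pig and Hive generate jobs of this shape from a script or an SQL-like query, so the programmer does not have to write the map and reduce functions by hand.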
Big Data GPS Integration. The Global Positioning System (GPS) helps us locate vehicles, ships, boats and people through satellite tracking. Integrating GPS with Big Data systems is a great help to people on the move in vehicles, boats or ships. Let us say you are driving on the expressway from city A to city B. GPS can use real-time data to determine your location and the petrol pumps, restaurants or resting places you are approaching for a short halt. Big Data also holds information about your liking for a particular type of food. The integrated system knows the time of day and sends a message to your smartphone or vehicle dashboard that you are nearing your favourite restaurant. You will feel excited about free, personalized information reaching you well in time.
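The restaurant alert in this scenario comes down to two steps: measuring the distance between the GPS position and each known place, and filtering by the traveller's recorded preferences. The sketch below uses the haversine great-circle formula for the distance. The place records, cuisine labels and the 5 km radius are made-up values for illustration, and a production system would use road distance along the route rather than straight-line distance.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearby_favourites(position, places, liked_cuisines, radius_km=5.0):
    """Names of liked places within radius_km of position, nearest first."""
    lat, lon = position
    hits = []
    for place in places:
        distance = haversine_km(lat, lon, place["lat"], place["lon"])
        if distance <= radius_km and place["cuisine"] in liked_cuisines:
            hits.append((distance, place["name"]))
    return [name for _, name in sorted(hits)]


# Hypothetical data: current GPS fix, known places, stored preferences.
current_position = (28.60, 77.20)
places = [
    {"name": "Dosa Corner", "cuisine": "south indian", "lat": 28.61, "lon": 77.20},
    {"name": "Pizza Stop", "cuisine": "pizza", "lat": 28.60, "lon": 77.21},
    {"name": "Far Away Idli", "cuisine": "south indian", "lat": 29.60, "lon": 77.20},
]
print(nearby_favourites(current_position, places, {"south indian"}))
```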
Big Data Applications. Big Data technology is a boon for large enterprises operating globally, helping them grow fast and provide greater satisfaction to their channel partners, customers and distributors (retail stores). Many BD applications are already in use and many more are on their way. Some common examples are briefly described below.
· Customer Shopping. Amazon uses Big Data to record our buying habits, financial capacity and likes. It even informs us of new brand arrivals. The BD system knows what you want to buy. There are many Big Data service providers who supply shopping data to various malls and business houses. This is a win-win strategy for earning big revenue, both for Big Data software companies and for shopping malls.
· Cinema / Theatre Ticketing. One popular example of recommendation is Netflix, which keeps a record of the titles you watch and recommends others you are likely to enjoy. A ticketing system can work the same way: by recording your choice of movies and the frequency of your visits to various theatres and cinema halls, it can automatically recommend movies being shown in the coming weeks. This can trigger your interest in buying tickets.
· Road Route Planning. For global road navigation on a 24x7 basis, Google Maps uses Big Data and special software tools to tell drivers the fastest route with less traffic congestion. With Google Maps fully integrated with GPS, the system guides us by voice all along the route, announcing approaching fuel stations, restrooms, restaurants and picnic spots, and even where to turn. All this makes our travel stress-free and enjoyable.
· Analytics for Preventive Maintenance. Based on field data about vehicle performance, a manufacturer such as Mercedes or Toyota can recall a particular brand or model of vehicle for preventive maintenance or retrofitting of a critical component. This delights the vehicle owners.
· Customer Analytics (CA). Big Data is well suited for CRM functions, particularly for companies with a global business. CA is a software tool designed for CRM functions like tracking customers’ likes and dislikes, wish-lists, buying habits, frequency of buying and financial capacity. For instance, a multinational company ABC in India has a global reach with over 5 million customers and 1,500 retailers. ABC has six manufacturing plants: two in India and one each in Sri Lanka, Bangladesh, Nepal and Dubai. To manage the requirements of such a large number of customers and retailers for collecting, stocking and distributing goods, the Corporate Head Quarter (CHQ) at Delhi needs to analyze huge volumes of data in many formats and from multiple locations. For good decision making, CHQ needs timely, accurate and well-formatted data. This is an ongoing process on a 24x7 basis. CA is equally beneficial for customers, retailers and CHQ. Let us take a typical case of a customer “X” who visits a retail store “S” on a weekend. As “X” enters the store and swipes an ID card, their shopping information is automatically picked up and transmitted across the network. This information could include personal data and shopping habits. The automated system can even inform special customers in advance about the arrival of a special brand and prompt them to buy it.
· Industrial Analytics (Manufacturing). Preventive maintenance is one example of how manufacturers can use Big Data. A vehicle breakdown can seriously impact all related manufacturing and transportation processes. To mitigate this, vehicle manufacturers should take timely, proactive action. They should place sensors near the vehicle engine to gather field data and carry out preventive maintenance. For this, the company should collect and analyze sensor data for several months to build a history of defects. Based on this historical data, the analytics can identify patterns that are likely to result in a mechanical breakdown. For instance, the system recognizes that the pattern formed by temperature sensor readings is similar to a pre-failure situation and alerts the maintenance team to check the machinery and fix it.
· Business Process Analytics (BPA). Some companies already use Big Data Analytics to monitor the performance of their remote employees, truck drivers and field salesmen, and to improve their efficiency. For example, transport companies can collect and store telemetry data that comes from each truck or car in real time. BPA can identify the typical behaviour of each driver. From this data, the company can plan safe driving conditions for its drivers by enforcing regular and timely halts for rest.
· Analytic Fraud Detection (AFD). Many leading banks already use AFD to detect credit card fraud in real time and send alert messages to the registered customer (the actual card owner). Suppose you reside in Delhi, India, and someone tries to shop in Dubai using your credit card details. The bank can check from your social network whether you are visiting Dubai on that date. Thus, banks can protect their customers from fraud.
· Healthcare Analytics (HCA). Big Data Analytics can help healthcare policymakers make better decisions and contribute towards better public healthcare services with greater satisfaction. The COVID-19 pandemic is a prime example of healthcare organizations using HCA for decision-making. By August 2020, the COVID-19 pandemic had taken the whole world by surprise, and no good treatment or vaccine was yet available. Both healthcare providers and lawmakers faced the difficult task of making good decisions for patient care and for their organizations. Researchers, service providers and policymakers depended on Big Data Analytics for healthcare to help improve their procedures and services for patient care, delivery of resources and preventive health measures. Researchers also used Big Data analytics tools to forecast possible constraints on hospital capacity and resources. During this period of uncertainty, Big Data Analytics for healthcare played a significant role, and effective vaccines were available by December 2020. Big Data models helped decision-makers decide on social distancing measures, school closure policies, testing capacity, contact-tracing strategies and mask-wearing by the population. Through such analytic models, policymakers could decide when and how to reopen businesses and schools, and how to distribute vaccines to various states within their countries.
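The pre-failure alert described under Industrial Analytics can be sketched with a very simple statistical rule: learn the normal range of a temperature sensor from historical readings, then raise an alert when several consecutive new readings sit far above that range. This is only an illustrative stand-in for the pattern recognition a real analytics platform would perform. The function names, the 3-sigma threshold and the run length of three readings are assumptions chosen for the example.

```python
from statistics import mean, stdev


def build_baseline(history):
    """Summarise normal behaviour as (mean, standard deviation)."""
    return mean(history), stdev(history)


def pre_failure_alert(readings, baseline, z_threshold=3.0, run_length=3):
    """Return the index at which an alert fires, or None if it never does.

    An alert fires once `run_length` consecutive readings are more than
    `z_threshold` standard deviations above the baseline mean.
    """
    mu, sigma = baseline
    streak = 0
    for index, value in enumerate(readings):
        if sigma > 0 and (value - mu) / sigma > z_threshold:
            streak += 1
            if streak >= run_length:
                return index
        else:
            streak = 0  # a normal reading breaks the run
    return None


# Hypothetical engine temperatures (degrees Celsius) from months of field data.
history = [80, 81, 79, 80, 82, 78, 80, 81, 79, 80]
baseline = build_baseline(history)

# A single spike is ignored; a sustained rise triggers the alert.
todays_readings = [80, 85, 81, 86, 87, 88]
alert_at = pre_failure_alert(todays_readings, baseline)
if alert_at is not None:
    print(f"Alert maintenance team: sustained overheating at reading {alert_at}")
```

Requiring a run of abnormal readings, rather than reacting to a single spike, is a common way to reduce false alarms from noisy sensors.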
Data Analytics for Insurance. Insurance companies have always depended on data analysis to monitor the satisfaction level of their customers and earn better revenue for their services. Different types of insurers, such as travel, health, life and agriculture insurance companies, rely on statistics to categorize their customers. Accident statistics, policyholders’ personal information and third-party sources help to classify and group people into different risk categories, prevent fraud losses and optimize expenses. The availability of digital platforms has provided new sources of information that can be used to understand the complex behavioural patterns of a customer and precisely determine his or her segment. For insurance purposes, big data refers to unstructured and/or structured data. Insurance companies form their plan of action and business model on the basic idea of anticipating and diversifying risks. These companies work to guarantee insurance contracts for uncertain situations. Big Data has revolutionized the insurance industry. Using BD Analytics allows insurance companies to target their customers more precisely and achieve higher customer satisfaction. The major activities in the operations of insurance companies that Big Data Analytics can impact are briefly described below:
· Customer Acquisition. Like every business, insurance companies need as many customers as possible to generate maximum revenue. This requires good outreach to woo customers and to take and retain a bigger share of the market for the product or service. Therefore, the process of acquiring customers should be made efficient and simple. Customer behaviour data collected from the web is unstructured and forms part of Big Data. By using appropriate analytics, insurance companies can create targeted marketing campaigns to acquire new customers.
· Customer Retention. A business is considered successful if its customer retention rate is high. Insurance companies rely heavily on BD Analytics to retain their customers. Based on customer activity, algorithms can predict the early signs of customer dissatisfaction. Based on the market intelligence provided, insurance companies can react quickly to improve their services and address the grievances of any particular customer. Insurers can offer discounts or even change the pricing model for the client.
· Risk Assessment. Insurance companies always focus on verifying customers’ information while assessing risks. Big Data technology can make the risk assessment process more efficient.
· Fraud Prevention and Detection. Using predictive modelling techniques, insurers can compare a person's data against past fraudulent profiles and identify cases that require more investigation. Thus, insurance companies can protect themselves against such fraud.
· Cost Reductions. Cost-cutting is a major consideration in any industry. Big Data technology can play a leading role in automating manual processes, making them more efficient and reducing the costs of handling claims and administration. This allows companies to offer lower premiums to their clients and be more competitive.
· Personalized Service and Pricing. We all like to be treated specially, with personalized services. Life insurance companies using big data can become more personalized by looking at the medical history of a customer. Big Data technology allows insurers to work quickly on a customer’s profile, decide on a suitable risk class, form a pricing model, automate claims processing and deliver the best services. A study by McKinsey shows that automation saves 43% of the time of insurance employees.
· Travel Insurance. Innovations can simplify and speed up interaction with customers. Automating communication improves customer satisfaction and facilitates speedy offers that appear more beneficial to the customer.
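The idea under Fraud Prevention and Detection, comparing a person's data against past fraudulent profiles, can be sketched as a simple attribute-matching score. Real insurers would use trained predictive models on far richer data; the rule below is only a hand-written stand-in. All field names, the sample profile and the 0.75 threshold are hypothetical values invented for the example.

```python
def profile_similarity(claim: dict, fraud_profile: dict) -> float:
    """Fraction of the fraud profile's attributes that the claim matches."""
    if not fraud_profile:
        return 0.0
    matches = sum(1 for key, value in fraud_profile.items()
                  if claim.get(key) == value)
    return matches / len(fraud_profile)


def needs_investigation(claim: dict, fraud_profiles: list,
                        threshold: float = 0.75) -> bool:
    """Flag a claim that closely resembles any known fraudulent profile."""
    return any(profile_similarity(claim, profile) >= threshold
               for profile in fraud_profiles)


# Hypothetical pattern distilled from past fraudulent claims.
known_fraud_profiles = [
    {
        "claim_type": "theft",
        "policy_age": "under 30 days",
        "police_report": False,
        "prior_claims": "3 or more",
    },
]

new_claim = {
    "claim_type": "theft",
    "policy_age": "under 30 days",
    "police_report": False,
    "prior_claims": "none",
}

if needs_investigation(new_claim, known_fraud_profiles):
    print("Route claim to the investigation team")
else:
    print("Process claim normally")
```

A flagged claim is only a candidate for closer review, not proof of fraud, which matches the role the text gives to predictive modelling: identifying cases that require more investigation.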
Job Creation. It is true that Cloud Computing and Big Data environments are taking away routine jobs of data entry and basic coding, but they are also offering new jobs, though these need a new skill set. This wave is unstoppable, and we all must learn new skills and adopt Cloud Computing and Big Data as the future computing environment. It is estimated that India will have a 32% share of the Big Data global market by 2025. Despite initial apprehension about data ownership, security and privacy, Cloud Computing and Big Data are the new business growth engines. Therefore, all young professionals must “Get Set and Go” to ride this wave of Big Data sweeping across the globe. Big Data is the key to business success for large organizations. It also offers energetic professionals the chance to become highly sought-after, well-paid Big Data consultants. You need good competency in new skills like Hadoop, MapReduce, Pig, Hive, MongoDB, Ruby, Java, Python and R. Statistical analysis using the R language can help you in dealing with Business Intelligence (BI). There are many jobs in Manufacturing, Banking, Healthcare, Transportation and Logistics, Warehouse Management and many more sectors. Likewise, there are many jobs in Government Departments (Finance, Meteorological, Agriculture, Environment, Labour) where Big Data is partially or fully adopted.
Big Data Impact on Workplaces. Today there is a data deluge, since data in the form of text, images, photographs and videos is being shared among millions of users across the globe on a 24x7 basis. Data is being transacted in an unstructured manner, in various formats and languages, and at great speed. This data includes Twitter feeds, posts and articles, comments and likes on LinkedIn, Facebook, Instagram and Telegram, and the sharing of blogs, posts, articles and tech publications. According to one survey, approximately 0.5 million comments are posted, 0.3 million statuses are updated and 0.14 million photos are uploaded to Facebook every minute. In addition, desktop computers, laptops, hand-held devices like tablets and smartphones, smart devices, sensors and IoT can generate many varieties of data. Such data can be emails, text messages, video content, voicemails, tweets and many other forms. When handled properly, such vast data (Big Data) can help the user in timely and accurate decision-making. However, this requires efficient analytical platforms that can handle such large volumes of data. Big Data will impact the work environment. The impact of Big Data on workplaces is briefly described below:
· Customer Relationship. Existing Customer Relationship Management (CRM) software does a great job of providing many departments with an overview of their customers' business profiles. Data from CRM software can be used to personalize sales processes, target marketing campaigns, nurture customer relationships, and more. However, CRM software works particularly well with structured data, such as names, addresses and product history. Big data, on the other hand, consists mostly of unstructured data, such as social media posts used for sentiment analysis. Structured data fits well in a database and can be quickly extracted for analysis. Unstructured data needs different handling, since it is more diverse and arrives at random times. Integrating unstructured big data into your CRM system gives deeper insight into your customers, uncovering their shopping frequency, wish-lists and financial capacity. This information can help in predictive modelling, better customer segmentation and developing innovative experiences for the customer. This helps in promoting new products.
· HR Hiring. Hard-copy (paper) CVs have been replaced by digital forms, which are convenient to fill in, cost nothing, offer more ways to showcase your skill set and reach the hiring agency immediately. Earlier, HR staff had the cumbersome task of manually sifting through all the CVs received, shortlisting candidates and then sending initial interview emails or letters to the potential candidates. That is why leading hiring agencies are adopting smarter hiring practices that use Big Data Analytics in conjunction with AI and ML. With available software, large volumes and varieties of data on potential candidates can be fed into a neural network, which can quickly carry out an in-depth analysis of qualifications, experience, traits and soft skills. Big data analytics can track post-hiring performance and highlight which factors lead to good versus bad selections. Thus, Big Data Analytics can take a considerable load off HR staff, who can devote their valuable time to other important tasks. This increases the productivity of HR staff, since fewer resources are needed to select the right candidate. Big Data can thus also help with the retention of good employees and with improvements in production, quality and growth.
· Decision-Making. The new work mantra is “Work Smarter, not Harder.” In the age of data-driven businesses, working smarter through the use of data analytics gives you a competitive edge. Research firm IDC estimated that organizations that make it a priority to discover and analyze relevant data could generate an extra $430 billion in productivity by 2020. Big data analytics helps businesses make smarter decisions faster, allowing them to find disruptive opportunities the moment they arise. Of course, this means harnessing real-time data streams, which requires the ability to tell good-quality data from poor-quality data.
·
Office Space. Big data and IoT with smart connected devices can
lead to more efficient and innovative office spaces.
·
Facility
optimization. IoT can be used to
save energy. Embedded sensors at key areas throughout the office can detect
which parts of the building are commonly used at which times of the day.
Electrical components can be automated to power on or power off when necessary.
Sensors outside the building can obtain real-time weather analytics to adjust
for changes in temperature – prompting the AC or the heat to kick in.
· Building Maintenance. This is another trending
use of Big Data and IoT. When certain areas of the building are not performing
at their optimum levels, sensors can raise an alert, and AI/ML can
notify building administrators of the likely root cause of each problem before any costly
damage occurs.
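The facility optimization idea above can be sketched in a few lines of Python. This is only an illustrative sketch, not a description of any particular building-management product: the sensor readings, zone names, occupancy threshold and temperature limits are all assumed values chosen for the example.

```python
from collections import defaultdict

# Hypothetical sensor readings: (zone, hour_of_day, people_detected).
READINGS = [
    ("lobby", 9, 12), ("lobby", 9, 8), ("lobby", 22, 0),
    ("meeting_room", 9, 0), ("meeting_room", 14, 6), ("meeting_room", 22, 0),
]


def average_occupancy(readings):
    """Average number of people seen per (zone, hour) pair."""
    totals = defaultdict(lambda: [0, 0])  # (zone, hour) -> [sum, count]
    for zone, hour, people in readings:
        entry = totals[(zone, hour)]
        entry[0] += people
        entry[1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


def power_schedule(readings, min_people=1.0):
    """Map each (zone, hour) to True (power on) or False (power off).

    min_people is an assumed threshold, not a recommended setting.
    """
    averages = average_occupancy(readings)
    return {key: avg >= min_people for key, avg in averages.items()}


def hvac_mode(outdoor_temp_c, low=18.0, high=26.0):
    """Pick a heating/cooling mode from an outdoor temperature reading.

    The low/high limits are assumptions for illustration.
    """
    if outdoor_temp_c > high:
        return "cool"
    if outdoor_temp_c < low:
        return "heat"
    return "off"


schedule = power_schedule(READINGS)
```

In practice the readings would arrive as a continuous stream from the building's sensors rather than a fixed list, and the schedule would be recomputed as usage patterns change.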
Issues related to adopting Big Data. An organization adopting Big
Data faces a number of issues, depending upon its business size,
product range, and business alliances. Some of the main issues faced by new
entrants are briefly given below:
· Accuracy. Big Data may contain some errors and is not suitable
where absolute accuracy is crucial.
· Applicability. Big Data is not one-size-fits-all and may not be suitable for
small to medium industries.
· Migration to Big Data. Existing organizations using a
client-server environment with a dedicated communication network are hesitant
to migrate from the existing environment to the new (Big Data) one. This is now made easier by software tools like Apache Sqoop, which is designed to work with RDBMSs like Oracle or MySQL. It
supports two-way data exchange between an RDBMS or enterprise data warehouse and Hive or
HBase. In addition, the service provider helps the customer with
importing and exporting data.
· Standardization. Although Big Data has been evolving for nearly 10 years
to meet industry requirements, there are still no universal standards. It may
take another 5-10 years to mature and to have universally accepted norms and
industry standards.
· Security of Data. Each of the V4 criteria
poses its own challenge when analyzing data. It is the responsibility of
the service provider or data handling organization / IT department to take care
of all technical aspects and ensure that received data is secure, accurate,
consistent and clean.
· Shortage of experienced professionals. There is an acute shortage of experienced professionals
who can lead the Big Data project team and meet the various requirements of
handling the Big Data environment.
· Steep Learning Curve.
One option is to train existing manpower in Big Data skills to overcome the
shortage of professionals. However, it takes time to learn and build good competence in handling
Big Data tools.
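To make the Sqoop migration step mentioned above more concrete, the sketch below assembles a Sqoop import command that copies one RDBMS table into Hive. It is a minimal sketch under stated assumptions: the JDBC URL, user name, password file path and table names are hypothetical, and the command is only built and printed here, not executed, since running it requires a working Sqoop and Hadoop installation.

```python
import shlex


def sqoop_import_command(jdbc_url, username, password_file,
                         table, hive_table, mappers=4):
    """Build the argument list for a Sqoop import from an RDBMS into Hive."""
    return [
        "sqoop", "import",
        "--connect", jdbc_url,             # JDBC connection string of the source database
        "--username", username,
        "--password-file", password_file,  # file holding the password, kept off the command line
        "--table", table,                  # source table in the RDBMS
        "--hive-import",                   # load the imported data into Hive
        "--hive-table", hive_table,        # destination Hive table
        "--num-mappers", str(mappers),     # number of parallel map tasks
    ]


# All values below are hypothetical placeholders for illustration.
cmd = sqoop_import_command(
    jdbc_url="jdbc:mysql://db.example.com/sales",
    username="etl_user",
    password_file="/user/etl/.db_password",
    table="orders",
    hive_table="analytics.orders",
)
print(shlex.join(cmd))
```

The reverse direction uses Sqoop's export tool, which reads files from a Hadoop directory and writes them back into a database table.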
Summary. In the last ten years, social
media has opened the floodgates for people across the world to interact
around the clock, every day of the year, from anywhere to anywhere. There are no
geographical boundaries, time zones, or fixed formats or types of data. Big
Data is characterized by V4 (Velocity, Variety, Veracity
and Volume), where data could be text, pictures, images, video or audio. This
data is unstructured and non-relational, requiring a high-tech workforce to
pre-process it and make it available to the decision-maker in the required
format. Tools such as Hadoop, MapReduce, Hive, Pig, Ruby, Python and MongoDB are
now available to handle the Big Data environment.
Big Data is an
invaluable resource for business decision-making, research work, health care and
agriculture. Big Data is also used by political parties to run propaganda and election
campaigns. One example is an election campaign run by one
country, which used social media data to transmit false messages intended to
influence US voters in favour of its preferred candidate.
Although Big Data is a very valuable information asset, it serves little purpose
unless it is applied to problem-solving. As the Digital World continues to
expand, business processes and practices will focus less on acquiring big
volumes of data and more on digging out relevant data for better
decision-making. Big Data needs the support of other technologies like Cloud
Computing, IoT, AI and ML, and software tools like Hadoop, MapReduce, Pig, Hive,
Scala, Ruby and Python.
Big Data and Data
Science are helping to evolve new business solutions. Big Data analytics has been the most
popular IT trend since 2012. It is gaining widespread application
in many industries, like Banking, Insurance, Healthcare, Transportation and Manufacturing, and in
many government sectors, like Departments of Finance, Healthcare, Metrology
and Agriculture. The trend shows that in the next five years, the
amount of unstructured data available to us will be huge. At the same time,
analytics technology will become more advanced. It will be the solution to
support our smart, fast and digital lifestyle. In the near future, one
may get a notification on one's smartphone warning of health issues that
may soon arise. It will also prescribe suitable medicines. Such
advances in technology are going to impact our lifestyle and human interaction. The
tidal wave of Big Data Technology is unstoppable. It is for industry and
organizations/institutes to adapt, learn to get the best out of it, and start
making better decisions. An early start will be an advantage,
while any hesitation or delay will mean a loss of business opportunity.