Big data refers to data sets whose size and complexity demand advanced analysis methods, such as predictive analytics. Challenges include the collection, storage, searching, analysis, visualization, transmission and security of the data. In many cases, big data is characterized using the "5 Vs": variety refers to the different kinds of data, velocity to the creation rate and processing speed, volume to the large amount of data, veracity to the reliability of the data, and value to its economic use.
Big Data Visualization
Visualization is an indispensable part of big data analysis for gaining knowledge from and benefiting from the data. It serves a range of purposes, from the initial exploration of the data to the generation of hypotheses, experimental validation, and the final presentation of results (Godfrey et al. 2016).
The interactive visualization of large data sets in conventional charts is problematic: the sheer number of data points cannot reasonably be displayed, and rendering simply takes too long. New methods address both problems, so that interactive visualization remains usable as a powerful analysis tool (Godfrey et al. 2016).
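One common family of such methods aggregates raw points into a fixed-size density grid before rendering, so the display cost depends on the grid, not on the data volume. The following sketch is a hypothetical, minimal illustration of this idea (the function name and grid size are assumptions, not taken from the cited work):

```python
import random
from collections import Counter

def bin_points(points, x_range, y_range, grid=(100, 100)):
    """Aggregate (x, y) points into a fixed-size density grid.

    However many raw points arrive, the renderer afterwards draws at
    most grid[0] * grid[1] cells."""
    (x_min, x_max), (y_min, y_max) = x_range, y_range
    nx, ny = grid
    counts = Counter()
    for x, y in points:
        if x_min <= x <= x_max and y_min <= y <= y_max:
            # Map the point to a grid cell; clamp the edge case x == x_max.
            i = min(int((x - x_min) / (x_max - x_min) * nx), nx - 1)
            j = min(int((y - y_min) / (y_max - y_min) * ny), ny - 1)
            counts[(i, j)] += 1
    return counts

# 100,000 random points collapse into at most 10,000 non-empty cells.
random.seed(0)
pts = [(random.random(), random.random()) for _ in range(100_000)]
density = bin_points(pts, (0.0, 1.0), (0.0, 1.0))
```

The interactive part then works on the small grid: zooming simply re-bins the points that fall inside the new range, which stays fast regardless of the total data volume.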
Image Processing: Video Surveillance
Big data is characterized by the variety of data to be processed, which also includes images and videos. Above all, the challenge is to store and transmit this data.
Private and public video surveillance has significantly increased over the last few years. However, the large amount of data makes it difficult to find the desired information. Big data applications can analyze videos and detect objects and human actions. This in turn makes it possible to automatically monitor large areas and preventatively identify suspicious persons and objects (Ramezani and Yaghmaee 2016; Xu et al. 2016d).
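A very simple building block of such video analysis is frame differencing: comparing consecutive frames and flagging regions that changed. The sketch below is a deliberately minimal, hypothetical example (real systems use far more robust detectors); frames are modeled as plain lists of grayscale pixel values:

```python
def motion_mask(prev_frame, frame, threshold=30):
    """Flag pixels whose intensity changed by more than `threshold`
    between two consecutive grayscale frames."""
    return [
        [abs(a - b) > threshold for a, b in zip(row_prev, row)]
        for row_prev, row in zip(prev_frame, frame)
    ]

def motion_ratio(mask):
    """Fraction of changed pixels -- a crude trigger for storing or
    analyzing only the 'interesting' parts of a video stream."""
    flat = [px for row in mask for px in row]
    return sum(flat) / len(flat)

# A static 4x4 frame, then the same frame with two bright pixels.
prev = [[0] * 4 for _ in range(4)]
cur = [row[:] for row in prev]
cur[1][1] = 255
cur[1][2] = 255
mask = motion_mask(prev, cur)
ratio = motion_ratio(mask)  # 2 of 16 pixels changed
```

In a surveillance setting, frames with a near-zero motion ratio could be discarded or stored at lower quality, which directly addresses the storage and transmission challenge mentioned above.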
Image Processing: Image Compression
Traditional image compression methods no longer satisfy the demands of storing and transmitting such large data volumes. Various new approaches are currently in development; for example, sets of images are compressed by exploiting their mutual correlations (Zhao et al. 2015).
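The intuition behind correlation-based compression can be shown with a toy delta-encoding sketch: a strongly correlated image is stored as pixel-wise differences from a reference image, and those differences are mostly small or zero, which standard entropy coders then compress well. This is a hypothetical illustration of the general idea, not the specific method of the cited work:

```python
def delta_encode(reference, image):
    """Store an image as pixel-wise differences from a correlated
    reference image; correlated images yield mostly small values."""
    return [[p - r for p, r in zip(img_row, ref_row)]
            for img_row, ref_row in zip(image, reference)]

def delta_decode(reference, deltas):
    """Recover the original image exactly from reference + deltas."""
    return [[r + d for r, d in zip(ref_row, d_row)]
            for d_row, ref_row in zip(deltas, reference)]

reference = [[10, 20], [30, 40]]
image = [[11, 19], [30, 41]]          # almost identical to the reference
deltas = delta_encode(reference, image)
restored = delta_decode(reference, deltas)
```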
Internet of Things: Geo Tags for Individual Travel
This topic is closely connected to the innovation trend of the Internet of Things itself. Here, however, the focus is on the processing of the collected data rather than on the possibilities of linking data.
Using geo-tagged contributions in social media, the travel routes of tourists can be reconstructed. This information can help to form customer segments and create individual offers for tourists (Bordogna et al. 2016; Su et al. 2016).
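The core of such a reconstruction is simple: group geo-tagged posts by user and order each user's posts by timestamp. The following sketch assumes a hypothetical input format of (user, timestamp, latitude, longitude) tuples with ISO-formatted timestamps:

```python
def travel_routes(posts):
    """Group geo-tagged posts by user and order them by timestamp,
    yielding an approximate travel route per tourist."""
    stops_by_user = {}
    for user, timestamp, lat, lon in posts:
        stops_by_user.setdefault(user, []).append((timestamp, (lat, lon)))
    # ISO timestamps sort correctly as plain strings.
    return {user: [loc for _, loc in sorted(stops)]
            for user, stops in stops_by_user.items()}

posts = [
    ("anna", "2016-07-01T12:00", 41.90, 12.50),   # Rome, noon
    ("anna", "2016-07-01T09:00", 43.77, 11.26),   # Florence, morning
    ("ben",  "2016-07-02T10:00", 48.86, 2.35),    # Paris
]
routes = travel_routes(posts)
```

Once routes exist per user, they can be clustered into typical travel patterns, which is the basis for the customer segmentation described above.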
Internet of Things: Market Analyses
Big data analyses of customer information, product reviews and sales numbers enable a fast recognition of market changes, for example altered customer needs. This in turn increases the response speed and flexibility of companies (Xu et al. 2016b).
Health Sector
The nursing staff in hospitals document a plethora of data in electronic medical files which can be used for prognoses, risk assessments and the analysis of treatment successes. For example, it can be determined how long a transplanted organ is accepted on average or which correlations exist between patient attributes and the progression of illnesses (Westra and Peterson 2016).
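One elementary tool for finding such correlations between patient attributes and outcomes is the Pearson correlation coefficient. The sketch below is a minimal, self-contained illustration on invented toy data (the attribute names and values are assumptions, not from the cited study):

```python
def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long
    numeric series (+1 = perfect positive linear relationship)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

# Toy data: patient age vs. length of hospital stay in days.
ages = [40, 50, 60, 70]
stay_days = [3, 6, 5, 8]
r = pearson(ages, stay_days)   # strongly, but not perfectly, positive
```

At big data scale the same statistic is computed over millions of records, often in a distributed fashion, but the interpretation is identical.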
During the mass spectrometry of proteins, enormous volumes of data are created which must be saved, processed and shared. The analyses are used to understand complex biological processes, thereby improving the precision of the models (Deng et al. 2015; Popescu et al. 2016).
Cloud computing is the storing and processing of data in remote data centers. The resources therefore need not be available locally, but they must remain accessible. Cloud computing is of special interest for large data quantities and complex big data computations (Pietri and Sakellariou 2016).
Before data sets are published, the data must be anonymized in order to protect privacy. For very large data sets, special anonymization processes are required. Otherwise, the powerful big data analyses still allow for an identification of personal data (Kao et al. 2015; Zhang and Xiang 2015).
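A classic formal criterion for such anonymization is k-anonymity: every combination of quasi-identifying attributes (e.g. generalized ZIP code and age range) must be shared by at least k records, so no individual stands out. The check below is a minimal illustrative sketch, not one of the special large-scale processes from the cited works:

```python
from collections import Counter

def is_k_anonymous(records, quasi_identifiers, k):
    """True if every combination of quasi-identifier values occurs in
    at least k records of the data set."""
    groups = Counter(
        tuple(record[attr] for attr in quasi_identifiers)
        for record in records
    )
    return all(count >= k for count in groups.values())

# Toy records with already-generalized attributes.
records = [
    {"zip": "123**", "age": "40-49", "diagnosis": "A"},
    {"zip": "123**", "age": "40-49", "diagnosis": "B"},
    {"zip": "124**", "age": "50-59", "diagnosis": "C"},
    {"zip": "124**", "age": "50-59", "diagnosis": "A"},
]
```

For very large data sets, computing and generalizing these groups efficiently is exactly where the special anonymization processes mentioned above come in.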
Workflow systems divide complex big data computations into sub-tasks and determine their execution order and resource allocation. This way, the use of resources is optimized, costs are reduced, and the speed of computations is increased (Rani and Babu 2015).
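Determining a valid execution order boils down to topologically sorting the dependency graph of the sub-tasks. The sketch below uses Kahn's algorithm on a hypothetical workflow (the task names are invented for illustration):

```python
def schedule(tasks):
    """Order sub-tasks so that each runs only after its dependencies
    (Kahn's algorithm for topological sorting).

    `tasks` maps a task name to the set of task names it depends on."""
    remaining = {task: set(deps) for task, deps in tasks.items()}
    order = []
    while remaining:
        # Tasks whose dependencies are all satisfied can run now.
        ready = [t for t, deps in remaining.items() if not deps]
        if not ready:
            raise ValueError("cyclic dependency between sub-tasks")
        for task in sorted(ready):       # sorted for deterministic output
            order.append(task)
            del remaining[task]
        for deps in remaining.values():
            deps.difference_update(ready)
    return order

workflow = {
    "ingest":    set(),
    "clean":     {"ingest"},
    "aggregate": {"clean"},
    "train":     {"clean"},
    "report":    {"aggregate", "train"},
}
order = schedule(workflow)
```

Tasks that become ready in the same round ("aggregate" and "train" here) have no mutual dependency, so a real workflow system would run them in parallel on separate resources.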
Data mining is used to extract phenomena, rules and knowledge from data. A more detailed description can be found in Section 8.9. This process is generally performed in the cloud where the required storage and computing resources are available (Talia 2015; Wang and Zhang 2015).
Service composition refers to the connection of different services within one system. There is a large number of web services with the same functionality but with different quality features, such as costs, response times and reliability. The selection of services is a complex but significant task in order to achieve the desired analysis results (Liu et al. 2015d).
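A simple way to frame this selection is as a weighted scoring over the quality attributes. The sketch below is a hypothetical, simplified example (it assumes the attributes are already normalized to comparable scales, which real approaches handle explicitly):

```python
def select_service(candidates, weights):
    """Pick the functionally equivalent service with the best weighted
    quality score: reliability counts positively, cost and response
    time count negatively."""
    def score(svc):
        return (weights["reliability"] * svc["reliability"]
                - weights["cost"] * svc["cost"]
                - weights["response_time"] * svc["response_time"])
    return max(candidates, key=score)

# Two services with the same functionality but different quality.
candidates = [
    {"name": "A", "cost": 1.0, "response_time": 0.2, "reliability": 0.95},
    {"name": "B", "cost": 0.5, "response_time": 0.4, "reliability": 0.99},
]
weights = {"cost": 1.0, "response_time": 1.0, "reliability": 2.0}
best = select_service(candidates, weights)
```

The hard part in practice is that a composition chains many such selections, and the quality of the whole workflow depends on the combination, not on each service in isolation.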
Big Data in Business: Audits
Companies document a lot of data which can serve as a foundation for big data analyses. This helps, for example, in uncovering cost reduction potential or identifying quality features (Warren et al. 2015; Ghosh 2015).
Big data improves the results of company audits as the data foundation is significantly larger, making it easier to determine complex relationships through algorithms (Sui and Fang 2015).
Big Data in Science: Libraries
Library customers can be provided with recommendations based on their existing interests and what other customers have read. Furthermore, metadata, such as the level of knowledge or additional documents, can also be taken into consideration (Li and Luo 2015).
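A minimal form of such a recommendation is item co-occurrence: titles that other customers borrowed together with the user's own loans are ranked by how strongly those customers' histories overlap. The names and data below are invented for illustration:

```python
from collections import Counter

def recommend(borrow_history, user, top_n=3):
    """Recommend titles that co-occur most often with the user's own
    loans across other customers' borrowing histories."""
    own = set(borrow_history[user])
    scores = Counter()
    for other, titles in borrow_history.items():
        if other == user:
            continue
        overlap = own & set(titles)
        if overlap:
            # The larger the shared taste, the stronger the vote.
            for title in set(titles) - own:
                scores[title] += len(overlap)
    return [title for title, _ in scores.most_common(top_n)]

history = {
    "anna": ["A", "B"],
    "ben":  ["A", "B", "C"],   # shares two titles with anna
    "cara": ["B", "D"],        # shares one title with anna
}
suggestions = recommend(history, "anna")
```

Metadata such as the customer's level of knowledge, as mentioned above, could enter this scheme as additional weights on the scores.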
Big Data in Business: Controlling
In controlling, big data supports the setup and development of control systems. In financial accounting, it improves the quality and relevance of the financial figures, creating transparency in the process. In reporting, big data can support the creation and refinement of balance sheet guidelines in order to ensure that only useful and current data is used (Warren et al. 2015).
Big Data in Business: Supply Chain Management
Big data in supply chain management and logistics opens up new options to increase customer benefits and reduce costs. For example, delivery networks can be optimized, dynamic transport routes created, material consumption forecast, or prices set based on data (Ghosh 2015; Ma et al. 2015).
Big Data Algorithms: Extreme Learning Machines
Big data applications require efficient algorithms in order to perform the complex computations within an appropriate timeframe. The biggest problem is the runtime, which grows exponentially with the data volume (Papalexakis and Faloutsos 2015).
Big Data Algorithms: Tensor Decomposition
Many big data applications require efficient tensor decomposition. Algorithms optimized for large tensors enable more complex and more precise evaluations with larger data sets (Papalexakis and Faloutsos 2015).
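Why decomposition helps with large data sets can be seen from the storage side alone: a rank-1 third-order tensor is fully described by three factor vectors, so n1·n2·n3 entries shrink to n1+n2+n3 numbers. The following toy sketch illustrates only this reconstruction idea, not an actual decomposition algorithm:

```python
def outer3(a, b, c):
    """Reconstruct a rank-1 third-order tensor from its three factor
    vectors: T[i][j][k] = a[i] * b[j] * c[k]."""
    return [[[ai * bj * ck for ck in c] for bj in b] for ai in a]

a, b, c = [1, 2], [3, 4, 5], [6, 7]
tensor = outer3(a, b, c)

full_entries = len(a) * len(b) * len(c)       # 12 numbers stored directly
factored_entries = len(a) + len(b) + len(c)   # 7 numbers in factored form
```

Real tensor decompositions (e.g. CP or Tucker) generalize this to sums of such rank-1 terms; the algorithms optimized for large tensors mentioned above compute those factors without ever materializing the full tensor.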
Big Data Algorithms: Fuzzy Logic
Fuzzy logic makes it possible to counter imprecisions, subjective estimations and manipulations in big data applications while improving the quality of analyses (Lewis and Martin 2015).
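The basic ingredient of fuzzy logic is a membership function that maps a crisp value to a degree of membership between 0 and 1, instead of a hard yes/no threshold. A common textbook choice is the triangular membership function, sketched below on an invented fuzzy set "high load":

```python
def triangular(x, lo, peak, hi):
    """Triangular membership function: degree (0.0 to 1.0) to which x
    belongs to a fuzzy set, rising from lo to peak and falling to hi."""
    if x <= lo or x >= hi:
        return 0.0
    if x <= peak:
        return (x - lo) / (peak - lo)
    return (hi - x) / (hi - peak)

# Fuzzy set "high load" on a 0-100 scale: fully true at 50.
membership = triangular(25, 0, 50, 100)   # partially "high"
```

Gradual memberships like this let an analysis tolerate imprecise or subjectively estimated inputs instead of flipping its result at an arbitrary cut-off.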
Apache Hadoop is an open source Java framework for processing large volumes of data. It consists of two primary components: The file system (HDFS – Hadoop distributed file system) which reliably stores large data volumes, and MapReduce, a programming model which performs parallel and distributed processing of the data. Hadoop is supplemented by YARN (Yet Another Resource Negotiator), a resource manager which performs the distribution of tasks (Gohil et al. 2015; Uzunkaya et al. 2015).