In the Russian-speaking world, both the English term Big Data and the phrase "big data" (a direct calque of it) are in use. Big data has no strict definition: it is impossible to draw a clear line - is it 10 terabytes or 10 megabytes? The name itself is highly subjective; the word "big" is like "one, two, many" among primitive tribes.

Still, there is an established view that big data is a set of technologies designed to perform three operations. First, to process volumes of data larger than in "standard" scenarios. Second, to work with rapidly arriving data in very large volumes: there is not just a lot of data, it keeps growing. Third, to work with structured and poorly structured data in parallel and in different aspects. Big data assumes that algorithms receive a stream of information that is not always structured and that more than one idea can be extracted from it.

A typical example of big data is the information coming from large physical experimental facilities: such an installation continuously produces huge volumes of data, and scientists use them to solve many problems in parallel.

Big data entered the public space because it came to affect almost everyone, not just the scientific community, where such problems have long been solved. Big Data entered the public sphere of technology when we started talking about a very specific number: the planet's population. 7 billion people are gathering on social networks and other projects that aggregate people: YouTube, Facebook, VKontakte, where user counts are measured in billions and the number of operations they perform simultaneously is enormous. The data flow in this case consists of user actions - for example, data from YouTube's hosting flowing through the network in both directions. Processing means not just interpretation, but the ability to handle each of these actions correctly, that is, to put it in the right place and make the data quickly available to every user, since social networks do not tolerate waiting.

Much of what concerns big data and the approaches used to analyze it has actually been around for a long time. For example, processing footage from surveillance cameras, where we deal not with a single picture but with a stream of data. Or robot navigation. All this has existed for decades, but now data processing tasks affect far more people and ideas.

Many developers are used to working with static objects and thinking in terms of states. In big data the paradigm is different: you have to be able to work with a continuous flow of data, and that is an interesting task. It affects more and more areas.

In our lives, more and more hardware and software are beginning to generate large amounts of data - for example, the Internet of Things.

Things are already generating huge flows of information. The Potok police system sends data from all its cameras and allows cars to be found using this data. Fitness bracelets, GPS trackers and other devices serving the needs of individuals and businesses are becoming ever more popular.

The Moscow Department of Informatization is recruiting a large number of data analysts, because a great deal of multi-criteria statistics on people is accumulating (that is, statistics across a very large number of criteria have been collected about each person and each group of people). Patterns and trends have to be found in this data. Such tasks require mathematicians with an IT education, because ultimately the data is stored in structured DBMSs, and you need to be able to access them and extract the information.

Previously, we did not treat big data as a problem for the simple reason that there was nowhere to store it and no networks to transmit it. When these capabilities appeared, the data immediately filled the entire volume provided. But however much bandwidth and storage capacity are expanded, there will always be sources - physical experiments, wing-airflow simulation experiments - that produce more information than we can transmit. According to Moore's law, the performance of modern parallel computing systems is steadily increasing, and the speeds of data transmission networks are also growing. However, data must also be saved to and retrieved from storage media (hard drives and other types of memory) quickly, and this is another challenge of big data processing.

You know this famous joke, right? Big Data is like sex before 18:

  • everyone thinks about it;
  • everyone talks about it;
  • everyone thinks their friends do it;
  • almost no one actually does it;
  • whoever does it does it badly;
  • everyone thinks it will work out better next time;
  • no one takes security measures;
  • everyone is ashamed to admit that they don't know something;
  • if someone succeeds at something, there is always a lot of noise about it.

But let's be honest: with any hype there is always plain curiosity - what is all the fuss about, and is there really something important there? In short, yes, there is. Details are below. We have selected for you the most striking and interesting applications of Big Data technologies. This small market survey, with clear examples, confronts us with a simple fact: the future is not on its way, there is no need to "wait another n years for the magic to become reality". It has already arrived, but it is still invisible to the eye, which is why the singularity has not yet seared a certain spot on the labor market. Let's go.

1 How Big Data technologies are applied where they originated

Large IT companies are where data science was born, so their internal know-how is the most interesting. Google, the birthplace of the MapReduce paradigm, runs an internal program whose sole purpose is to train its programmers in machine learning technologies. And therein lies its competitive advantage: after acquiring new knowledge, employees apply new methods in the Google projects where they work daily. Imagine how long the list of areas is in which the company could make a revolution. One example is neural networks, which the company applies across its services.

Apple implements machine learning in all its products. Its advantage is a large ecosystem that includes all the digital devices used in everyday life. This allows the company to reach a level impossible for others: it has more user data than anyone else. At the same time, its privacy policy is very strict: the corporation has always boasted that it does not use customer data for advertising purposes. Accordingly, user information is encrypted so that neither Apple's lawyers nor even the FBI with a warrant can read it. An excellent overview of Apple's developments in the field of AI can be found online.

2 Big Data on 4 wheels

A modern car is a store of information: it accumulates data about the driver, the environment, connected devices and itself. Soon a single vehicle connected to a network will generate up to 25 GB of data per hour.

Vehicle telematics has been used by automakers for many years, but a more sophisticated data collection method that takes full advantage of Big Data is now being pushed. This means technology can now warn the driver of bad road conditions by automatically activating the anti-lock braking and traction control systems.

Other companies, including BMW, use Big Data technology, combined with information collected from test prototypes, on-board error-memory systems and customer complaints, to identify model weaknesses early in production. Instead of manually evaluating the data, which takes months, a modern algorithm is now applied. Errors and troubleshooting costs are reduced, which speeds up information analysis workflows at BMW.

According to expert estimates, by 2019 the connected-car market turnover will reach $130 billion. This is not surprising, given the pace at which automakers are integrating technologies that have become an integral part of the vehicle.

Using Big Data helps make cars safer and more functional. Toyota, for example, embeds information communication modules (DCM) in its vehicles; this Big Data tool processes and analyzes the data the DCM collects to extract further value from it.

3 Application of Big Data in medicine


Implementing Big Data technologies in medicine lets doctors study a disease more thoroughly and choose an effective course of treatment for the particular case. Analyzing information makes it easier for health workers to predict relapses and take preventive measures. The result is more accurate diagnoses and improved treatment methods.

The new technique has allowed doctors to look at patients' problems from a different perspective, leading to the discovery of previously unknown sources of disease. For example, some races are genetically more prone to heart disease than other ethnic groups. Now, when a patient complains of a certain condition, doctors also take into account data about members of his race who reported the same problem. Collecting and analyzing data makes it possible to learn far more about patients: from food preferences and lifestyle to the genetic structure of DNA and the metabolites of cells, tissues and organs. Thus, the Center for Pediatric Genomic Medicine in Kansas City works with patient data and analyzes the mutations in the genetic code that cause cancer. An individual approach to each patient that takes his DNA into account will raise treatment effectiveness to a qualitatively different level.

Understanding how to use Big Data is the first and very important change in the medical field. When a patient undergoes treatment, a hospital or other healthcare facility can obtain a lot of relevant information about the person. The collected information is used to predict disease recurrence with a certain degree of accuracy. For example, if a patient has suffered a stroke, doctors study the timing of the cerebrovascular event, analyze the intervals between previous episodes (if any), and pay special attention to stress and heavy physical exertion in the patient's life. Based on this data, hospitals give the patient a clear action plan to prevent a future stroke.

Wearable devices also play a role, helping to identify health problems even if a person does not have obvious symptoms of a particular disease. Instead of assessing the patient’s condition through a long course of examinations, the doctor can draw conclusions based on the information collected by a fitness tracker or smart watch.

One recent example: while a man was being examined for a new seizure caused by a missed medication, doctors discovered he had a much more serious health problem - atrial fibrillation. The diagnosis was made because department staff gained access to the patient's phone, namely the application linked to his fitness tracker. Data from the application proved to be the key factor in making the diagnosis, since no cardiac abnormalities were detected in the man at the time of the examination.

This is just one of the cases that show why the use of big data plays such a significant role in medicine today.

4 Data analysis has already become the core of retail

Understanding user queries and targeting is one of the largest and most publicized areas where Big Data tools are applied. Big Data helps analyze customer habits in order to better understand future consumer needs. Companies seek to enrich the traditional data set with information from social networks and browser search history to build the most complete customer picture possible. Sometimes large organizations make creating their own predictive model a company-wide goal.

For example, the Target retail chain, using in-depth data analysis and its own forecasting system, manages to determine with high accuracy whether a customer is expecting a child. Each client is assigned an ID, which in turn is linked to a credit card, name or email. The identifier serves as a kind of shopping cart storing information about everything the person has ever purchased. Network analysts found that pregnant women actively buy unscented products before the second trimester of pregnancy, and during the first 20 weeks stock up on calcium, zinc and magnesium supplements. Based on this data, Target sends customers coupons for baby products. The discounts on children's goods are "diluted" with coupons for other products, so that offers to buy a crib or diapers do not look too intrusive.
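For illustration only, here is a minimal sketch of how such a purchase-pattern predictor might be built. Target's actual model is proprietary and undisclosed; the features, training data and labels below are invented assumptions.

```python
# Hypothetical sketch of a Target-style purchase-pattern model.
# The real system is proprietary; features and data here are invented.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Each row aggregates one customer ID's recent purchases:
# [unscented lotion buys, Ca/Zn/Mg supplement buys, cotton ball buys]
X = np.array([
    [0, 0, 1],
    [3, 2, 4],   # heavy buying of unscented products and supplements
    [1, 0, 0],
    [4, 3, 5],
])
y = np.array([0, 1, 0, 1])  # 1 = known baby-registry signup, used as a label

model = LogisticRegression().fit(X, y)

# Score a new shopper; a high probability could trigger the coupon mailing
# described above, "diluted" with unrelated offers.
new_customer = np.array([[2, 3, 3]])
print(model.predict_proba(new_customer)[0, 1])
```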

Even government departments have found a way to use Big Data technologies to optimize election campaigns. Some believe that Barack Obama's victory in the 2012 US presidential election was due to the excellent work of his team of analysts, who processed huge amounts of data in the right way.

5 Big Data protects law and order


Over the past few years, law enforcement agencies have figured out how and when to use Big Data. It is well known that the National Security Agency uses Big Data technologies to prevent terrorist attacks. Other agencies use advanced methods to prevent smaller crimes.

The Los Angeles Police Department uses predictive-analytics software to do what is commonly called proactive policing. Using crime reports accumulated over a period of time, the algorithm identifies the areas where crime is most likely to occur. The system marks such areas on the city map with small red squares, and this data is immediately transmitted to patrol cars.
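The hotspot idea can be sketched in a few lines: bin past incident coordinates into grid cells and rank the busiest cells. The deployed system is far more sophisticated; the coordinates, cell size and counts below are invented for illustration.

```python
# Minimal hotspot sketch: snap past incidents to a grid, flag busy cells.
from collections import Counter

incidents = [(34.05, -118.25), (34.06, -118.24), (34.05, -118.25),
             (34.10, -118.30), (34.05, -118.26)]  # (lat, lon) of past reports

CELL = 0.01  # grid step in degrees, roughly a city-block scale

def cell_of(lat, lon):
    """Snap a coordinate to its grid cell."""
    return (round(lat / CELL), round(lon / CELL))

counts = Counter(cell_of(lat, lon) for lat, lon in incidents)

# The top cells would be drawn as red squares and pushed to patrol cars.
for cell, n in counts.most_common(3):
    print(f"cell {cell}: {n} incidents")
```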

Chicago police use Big Data technologies in a slightly different way. Law enforcement in the Windy City runs a similar algorithm, but it is aimed at outlining a "risk circle" of people who could become the victim of, or a participant in, an armed attack. According to The New York Times, the algorithm assigns each person a vulnerability rating based on his criminal history (arrests, involvement in shootings, membership in criminal groups). The system's developer says that while it examines a person's criminal history, it does not take into account secondary factors such as race, gender, ethnicity or location.

6 How Big Data technologies help cities develop


Veniam CEO Joao Barros shows a map tracking Wi-Fi routers on Porto's buses

Data analysis is also used to improve many aspects of life in cities and countries. For example, knowing exactly how and when to apply Big Data technologies makes it possible to optimize traffic flows: vehicle movement is tracked in real time, and social media and weather data are analyzed. Today a number of cities have committed to using data analytics to merge transport infrastructure and other utilities into a single whole. This is the concept of the "smart" city, where buses wait for late trains and traffic lights can predict congestion to minimize traffic jams.

Using Big Data technologies, the city of Long Beach operates smart water meters to stop illegal watering. Previously they were used to cut private households' water consumption (the best result was an 80% reduction). Saving fresh water is always a pressing issue, especially when the state is going through the worst drought ever recorded.

Representatives of the Los Angeles Department of Transportation have also joined the ranks of Big Data users. Based on data from traffic camera sensors, the authorities monitor the operation of traffic lights, which in turn allows traffic to be regulated. The computerized system controls about 4,500 traffic lights across the city, and according to official data the new algorithm has helped reduce congestion by 16%.

7 The engine of progress in marketing and sales


In marketing, Big Data tools make it possible to identify which ideas most effectively promote a product at each stage of the sales cycle. Data analysis shows how investment can improve customer relationship management, which strategy should be chosen to raise conversion rates, and how to optimize the customer lifecycle. In cloud businesses, Big Data algorithms are used to work out how to minimize customer acquisition cost and increase customer lifetime value.

Differentiating pricing strategies according to a customer's in-system level is perhaps the main thing Big Data is used for in marketing. McKinsey found that about 75% of an average firm's revenue comes from its core products, 30% of which are mispriced. A 1% price increase yields an 8.7% increase in operating profit.

The Forrester research team found that data analytics allows marketers to focus on how to make customer relationships more successful. By examining the direction of customer development, specialists can assess the level of their loyalty, as well as extend the life cycle in the context of a specific company.

Optimizing sales strategies and the stages of entering new markets with geo-analytics is reflected in the biopharmaceutical industry. According to McKinsey, drug manufacturers spend on average 20 to 30% of profits on administration and sales. If enterprises use Big Data more actively to identify the most profitable and fastest-growing markets, costs will fall immediately.

Data analytics is a means for companies to gain a complete picture of key aspects of their business. Increasing revenue, reducing costs and reducing working capital are three challenges that modern businesses are trying to solve with the help of analytical tools.

Finally, 58% of marketing directors say that the implementation of Big Data technologies is most visible in search engine optimization (SEO) and in e-mail and mobile marketing, where data analysis plays the most significant role in shaping marketing programs. And only 4% fewer respondents are confident that Big Data will play a significant role in all marketing strategies for many years to come.

8 Global data analysis

No less curious is the use of Big Data in climate analysis. It is possible that machine learning will ultimately be the only force capable of maintaining the delicate balance. The human influence on global warming still causes much controversy, so only reliable predictive models built on the analysis of large volumes of data can give an accurate answer. Ultimately, cutting emissions will help us all: we will spend less on energy.

Big Data is no longer an abstract concept that may find its application in a couple of years. It is a fully working set of technologies useful in almost every area of human activity: from medicine and public order to marketing and sales. The stage of active integration of Big Data into our daily lives has only just begun, and who knows what role Big Data will play in a few years?

Big Data: what it is in simple words

In 2010, the first attempts to solve the growing big data problem began to appear. Software products were released aimed at minimizing the risks of using huge volumes of information.

By 2011, large companies such as Microsoft, Oracle, EMC and IBM had become interested in big data; they were the first to incorporate Big Data developments into their strategies, and quite successfully.

Universities began studying big data as a separate subject in 2013; problems in this area are now dealt with not only by data science but also by engineering together with computing disciplines.

The main methods of data analysis and processing include the following:

  1. Methods of the Data Mining class (deep data analysis).

These methods are quite numerous, but they share one thing: mathematical tools applied together with advances in information technology.

  2. Crowdsourcing.

This technique makes it possible to obtain data from several sources simultaneously, and the number of sources is practically unlimited.

  3. A/B testing.

A control set of elements is selected from the full volume of data and compared in turn with other similar sets in which one element has been changed. Such tests help determine which parameter fluctuations have the greatest effect on the control population. Thanks to the volume of Big Data, a huge number of iterations can be run, each getting closer to the most reliable result (a small sketch follows this list).

  4. Predictive analytics.

Specialists in this field try to predict in advance how the object under control will behave, in order to make the most advantageous decision in the situation.

  5. Machine learning (artificial intelligence).

It is based on the empirical analysis of information and the subsequent construction of self-learning algorithms.

  6. Network analysis.

Most often used to study social networks: after statistical data is obtained, the nodes of the network are analyzed, that is, the interactions between individual users and their communities.
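As promised above, here is a minimal A/B-testing sketch: a control variant is compared with a modified one using a two-proportion z-test. The visitor and conversion counts are invented for illustration.

```python
# A/B test sketch: did changing one element move the conversion rate?
from math import sqrt
from statistics import NormalDist

conv_a, n_a = 120, 2400   # control: conversions, visitors
conv_b, n_b = 156, 2400   # variant with one element changed

p_a, p_b = conv_a / n_a, conv_b / n_b
p_pool = (conv_a + conv_b) / (n_a + n_b)             # pooled conversion rate
se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
z = (p_b - p_a) / se
p_value = 2 * (1 - NormalDist().cdf(abs(z)))         # two-sided test

print(f"A: {p_a:.3f}  B: {p_b:.3f}  z = {z:.2f}  p = {p_value:.4f}")
# With big data, many such iterations can be run, each narrowing in on
# the parameter change with the greatest effect.
```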

Prospects and trends for the development of Big data

In 2017, when big data ceased to be something new and unknown, its importance not only did not decrease, but increased even more. Experts are now betting that big data analytics will become available not only to giant organizations, but also to small and medium-sized businesses. This approach is planned to be implemented using the following components:

  • Cloud storage.

Data storage and processing are becoming faster and more economical - compared to the costs of maintaining your own data center and possible expansion of staff, renting a cloud seems to be a much cheaper alternative.

  • Using Dark Data.

So-called "dark data" is all the non-digitized information about a company that plays no key role in day-to-day use but can become a reason to switch to a new format of information storage.

  • Artificial Intelligence and Deep Learning.

Machine learning technology, which imitates the structure and operation of the human brain, is ideally suited to processing large volumes of constantly changing information. In this case the machine does everything a person would do, but the probability of error is significantly reduced.

  • Blockchain

This technology makes it possible to speed up and simplify numerous online transactions, including international ones. Another advantage of Blockchain is that it reduces transaction costs.

  • Self-service and reduced prices.

In 2017, it is planned to introduce “self-service platforms” - these are free platforms where representatives of small and medium-sized businesses can independently evaluate the data they store and systematize it.

VISA, for example, uses Big Data to track fraudulent attempts to carry out transactions. Thanks to this, it saves more than $2 billion a year from leakage.

The German Labor Ministry cut costs by 10 billion euros by introducing a big data system into its work on unemployment benefits: it was revealed that a fifth of citizens had been receiving these benefits without grounds.

Big Data has not passed the gaming industry by either. The World of Tanks developers studied information about all players and compared the available indicators of their activity. This helped predict possible future player churn; based on these findings, the company was able to interact with users more effectively.

Notable organizations using big data also include HSBC, Nasdaq, Coca-Cola, Starbucks and AT&T.

Big Data problems

The biggest problem with big data is the cost of processing it, which includes both expensive equipment and the payroll of qualified specialists capable of handling huge volumes of information. Obviously, the equipment will have to be updated regularly so that it does not lose minimal functionality as data volumes grow.

The second problem again relates to the sheer volume of information to be processed. If a study produces not 2-3 results but a great many, it is very difficult to remain objective and select from the general flow only the data that will really affect the state of the phenomenon.

The third problem is Big Data privacy. With most customer services moving online, it is very easy to become the next target of cybercriminals. Even simply storing personal information, without making any online transactions, can be fraught with undesirable consequences for cloud storage customers.

The fourth problem is information loss. Precaution requires not limiting yourself to a single one-time backup but keeping at least 2-3 backup copies. However, as volume grows, so do the difficulties of redundancy, and IT specialists are trying to find the optimal solution to this problem.

Big data technology market in Russia and the world

As of 2014, services accounted for 40% of the big data market. Revenue from the use of Big Data in computer equipment was slightly behind (38%), and the remaining 22% came from software.

According to statistics, the most useful products in the global segment for solving Big Data problems are In-memory and NoSQL analytical platforms. Log-file analytical software and columnar platforms occupy 15% and 12% of the market, respectively. Hadoop/MapReduce, by contrast, copes with big data problems in practice not very effectively.

Results of implementing big data technologies:

  • increasing the quality of customer service;
  • optimization of supply chain integration;
  • optimization of organization planning;
  • acceleration of interaction with clients;
  • increasing the efficiency of processing customer requests;
  • reduction in service costs;
  • optimization of processing client requests.

Best books on Big Data

"The Human Face of Big Data" by Rick Smolan and Jennifer Erwitt

Suitable for an initial study of big data processing technologies: it introduces the subject simply and clearly, showing how the abundance of information has affected everyday life and all its spheres: science, business, medicine and so on. Numerous illustrations make it effortless to take in.

"Introduction to Data Mining" by Pang-Ning Tan, Michael Steinbach and Vipin Kumar

Also useful for beginners, this book explains working with big data on the "from simple to complex" principle. It covers many points important at the initial stage: preparing for processing, visualization, OLAP, and some methods of data analysis and classification.

"Python Machine Learning" by Sebastian Raschka

A practical guide to working with big data using the Python programming language. Suitable both for engineering students and for professionals who want to deepen their knowledge.

"Hadoop for Dummies", Dirk Derus, Paul S. Zikopoulos, Roman B. Melnik

Hadoop is a project created specifically to work with distributed programs that organize the execution of actions on thousands of nodes simultaneously. Getting to know it will help you understand the practical application of big data in more detail.

Yulia Sergeevna Volkova, 4th-year student, Financial University under the Government of the Russian Federation, Kaluga branch, Kaluga. [email protected]

Big Data in the modern world

Abstract. The article is devoted to the implementation of Big Data technologies in modern society. The main characteristics of Big Data are studied, and the main areas of application are considered: banking, retail, the private and public sectors, and even everyday life. The study reveals the drawbacks of using Big Data technologies and outlines the need to develop regulatory regulation of their use.
Keywords: Big Data, banks, banking sector, retail, private sector, public sector.

As information technology tools become more deeply integrated into all areas of modern society, the requirements for their ability to solve new problems involving huge amounts of data grow as well. There are volumes of information that cannot be processed in traditional ways, including structured data, media data and random objects. And while today's technologies more or less cope with analyzing the first, analyzing the second and third remains practically an overwhelming task. Research shows that the volume of media data, such as video surveillance, aerial photography and digital medical information, and of random objects stored in numerous archives and clouds, grows year by year. This enormous volume of data has become a global phenomenon and is called Big Data.

The works of both foreign and Russian scientists are devoted to the study of Big Data: James Manyika, Michael Chui, V. V. Toporkov, V. I. Budzko. Large global companies such as McKinsey & Company, CNews Analytics, SAP, Oracle, IBM, Microsoft, Teradata and many others make a significant contribution to the study of this technology; they process and analyze data and create software and hardware systems based on Big Data.

According to a McKinsey Institute report, "Big Data is a set of data whose size exceeds the capabilities of typical database software tools for capturing, storing, managing and analyzing data." In essence, the concept of big data means working with information of huge volume and diverse composition, constantly updated and scattered across different sources, with the aim of increasing operational efficiency, creating new products and raising competitiveness. The consulting company Forrester puts it briefly and fairly clearly: "Big data combines techniques and technologies that extract meaning from data at the extreme limit of practicality."

Today the field of Big Data is characterized by the following features:
Volume: the accumulated database represents a large amount of information.
Velocity: the growing speed of data accumulation (90% of all information was collected over the last 2 years).
Variety: the ability to simultaneously process structured and unstructured information of various formats.
Marketing experts love to add their own "V"s here. Some speak of veracity; others add that big data technologies must certainly bring benefit to the business (value).

It is expected that by 2020 the amount of information accumulated on the planet will double every two years. The abundance of data invites its use for analysis and forecasting, and enormous volumes require the appropriate technology. Companies today must process colossal amounts of data, in volumes that are hard to imagine; traditional databases cannot cope with such a task, which creates the need to implement Big Data technologies. The table below compares Big Data and traditional databases; it is based on research by V. I. Budzko and the Moscow Exchange.

Table 1. Comparative characteristics of Big Data and traditional databases

Criterion | Traditional databases | Big Data
Application area | One or a few subject areas | Vast: from identifying customer preferences to risk analysis
Data characteristics | Only structured data | Huge amounts of information with a complex heterogeneous and/or uncertain structure
Data storage method | Centralized | Decentralized
Storage and processing model | Vertical model | Horizontal model
Volume of information processed | From gigabytes (10^9 bytes) to terabytes (10^12 bytes) | From petabytes (10^15 bytes) to exabytes (10^18 bytes)

Thus, traditional databases cover only one or a few subject areas and hold only structured data, whereas Big Data is applied across a vast range of fields and handles huge amounts of information with a complex structure. According to the CNews Analytics study presented in Figure 1, the Russian market is arriving at the Big Data phenomenon, as the growing maturity of companies shows. Many companies are switching to Big Data technologies because of the volume of data they process: more than 44% already generate about 100 terabytes, and for 13% data volumes exceed 500 terabytes.

Fig. 1. Volumes of information processed in companies

Such volumes cannot be processed by traditional databases, so these companies see switching to Big Data not just as a way to process huge volumes, but also as a way to increase competitiveness, raise customer loyalty to their product and attract new customers. The most active adopters of such solutions are banks, telecoms and retail; their shares are shown in Figure 2 (banks 44%, telecom 16%, retail 13%, public sector 7%, others 20%). The number of companies using or ready to use big data in transport, energy and industry is smaller. The first examples of big data use have also appeared in the public sector.

Fig. 2. Industry structure of Big Data use

As for Western governments, various estimates put the digital economy at between 3% and 21% of G20 countries' GDP. The Russian public sector has not yet achieved notable results with big data; today in Russia it is mainly commercial enterprises that are interested in such technologies: retail chains, banks, telecommunications companies. According to the Russian Association of Electronic Communications, the volume of the digital economy in the Russian Federation is only 1 trillion rubles, about 1.5% of GDP; nonetheless, the Russian Federation has enormous potential for digital-economy growth.

Despite the short life of the Big Data sector, there are already assessments of the effective use of these technologies based on real examples. Banks today process on average about 3.8 petabytes of data, and they use Big Data technologies for specific tasks:

  • collecting data on credit card usage;
  • collecting data on collateral;
  • collecting data on loans;
  • collecting data on customer profiles;
  • collecting data on customer savings.

Banks report that since they began using Big Data technologies they have been able to attract new customers, interact better with both new and existing customers, and maintain their loyalty. In 2015, CNews Analytics surveyed the thirty largest Russian banks by total assets to find out which big data technologies they use and for what purposes. Compared with the 2014 survey, the number of top-30 banks reporting the use of big data technologies grew, but this change is more likely due to a change in the composition of the top 30. Figure 3 compares the 2015 survey with that of 2014, based on a survey by A. Kiryanova.

Fig. 3. Use of Big Data by the top-30 Russian banks

According to IBS estimates, 80% of the banks that answered positively are implementing Big Data Appliances: software and hardware systems for storing and processing data. These solutions usually serve as analytical or transactional storage, whose main advantage is high performance on large data volumes. Even so, the practice of using big data in Russian banks is in its infancy. The reason for this slow adoption is the wary attitude of customers' IT specialists toward new technologies: they are not confident that big data technologies will solve their problems in full. In the American market, by contrast, banks have already accumulated 1 exabyte of data, which can be compared to 275 billion mp3 recordings. The number of sources from which information arrives is vast; the classic ones include:

  • visits by clients to bank offices;
  • recordings of telephone calls;
  • client behavior on social networks;
  • credit card transaction information;
  • and others.

Offline retail uses big data to analyze customer behavior, design routes through the sales floor, arrange goods correctly, plan purchases and, ultimately, increase sales. In online retail, the sales mechanism itself is built on big data: users are offered products based on previous purchases and personal preferences, information about which is collected, for example, from social networks. In both cases, big data analysis helps reduce costs, increase customer loyalty and reach a larger audience. As companies' trading potential grows, traditional databases no longer meet expanding business requirements, and such a system cannot provide adequate detail for management accounting. Switching to big data makes it possible to optimize the management of product distribution, keep data current and its processing fast enough to assess the consequences of management decisions, and quickly generate management reports.

The total volume of accumulated data exceeds 100 exabytes; Walmart alone processes 2.5 petabytes of data per hour using big data. Using Big Data technologies raises operating profitability by 60%; according to Hadoop statistics, after implementing Big Data, analytics productivity rises to the processing of 120 algorithms, and profits grow by 7-10%. Russian retail, however, is only beginning to gain momentum, as the gap in information processing is very wide: Russian online retail is 18 times smaller than China's, and the entire data turnover produced in Russian online retail is 4.5 times smaller than that of a single Amazon store. The number of online stores in Russia that use Big Data is under 40 thousand, while in Europe there are more than 550 thousand such stores, which characterizes the Russian retail market as still developing and not fully formed.

As for our daily lives, Big Data technologies are used there in ways we have not even thought about. Every day the Shazam music service processes 15 million songs worldwide, roughly 1.5-2 petabytes, and on this basis music producers predict an artist's popularity. Big data is also used to process credit card information for systems such as MasterCard and Visa: MasterCard processes 65 billion transactions a year, from 1.9 billion cards at 32 million merchants, to predict trade trends.

Every day people around the world post 19 terabytes of data to social networks such as Twitter and Facebook: they upload and process photos, write and send messages, and so on. Infrastructure also uses Big Data technologies, from trolleybuses to airplanes and rockets. In the London Underground, turnstiles record about 20 million passes every day; analysis based on Big Data technologies identified 10 possible epicenters, which is also taken into account in the further development of the metro.

Undoubtedly, the variety and volume of data arising from all kinds of interactions is a powerful foundation on which business can build and refine forecasts, identify patterns, evaluate performance and so on. However, everything has its drawbacks, and they must be carefully taken into account. Despite the obvious and potential advantages of using Big Data, its use also has disadvantages, associated above all with large volumes of information, the variety of ways of accessing it, and the often inadequate resourcing of information security functions in organizations. The problems associated with the use of Big Data technologies are presented in Figure 4.

Fig. 4. Problems of using Big Data: data management, maintaining data security requirements, legal regulation, risk identification. Among the issues: no country yet has dedicated Big Data legislation; data must be masked to protect the original sources; companies must be sure that all data security requirements are monitored and met; implementing Big Data solutions may create or uncover previously confidential information.

All these problems mean that many companies are wary of introducing big data technologies, since working with third parties raises the problem of disclosing insider information that the company could not reveal using only its own resources. In my opinion, the most important step toward the full adoption of technologies based on big data must be the legislative aspect. Laws already exist that limit the collection, use and storage of certain types of personal data, but they do not fully cover big data, so special legislation is needed for it. To keep up with rapidly changing and newly adopted laws, companies must carry out an initial inventory of the relevant regulations and update that list regularly. Nevertheless, despite all the shortcomings above, as the experience of Western companies shows, Big Data technologies successfully solve both modern business problems of raising competitiveness and problems directly related to people's lives. Russian companies are already on the path of adopting Big Data technologies in both the production and public spheres, as the amount of information nearly doubles every year. Over time, many areas of our lives will be changed by Big Data.

Links to sources

1. Budzko V. I. High-availability systems and Big Data // Big Data in the National Economy, 2013. P. 16-19.
2. Korotkova T. "EMC Data Lake 2.0 - a means of transition to big data analytics and the digital economy." http://bigdata.cnews.ru/News/Line/20151203_EMC_DATA_LAKE_20_POMOZHET_PEREJTI_K_ANALITIKE
3. Kiryanova A. "Big data has not become mainstream in Russian banks." http://www.cnews/top/bolshie_dannye_ne_ne_mejnstrimom
4. CNews. "Infographics: Big data came to Russia." http://bigdata.cnews.ru/articles/infografika_bolshie_dannye_prishli_v_rossiyu
5. CNews. "Infographics: How retail uses big data." http://bigdata.cnews.ru/articles/infografika_kak_roznitsa_ispolzuet
6. CNews. "Infographics: Big Data technologies." http://bigdata.cnews.ru/articles/big_data_v_zhizni_cheloveka
7. CNews. "Infographics: What big data can do in banks." http://bigdata.cnews.ru/articles/infografika_chto_mogut_bolshie_dannye
8. Moscow Exchange. "Analytical review of the Big Data market." http://habrahabr.ru/company/moex/blog/256747/
9. Big Data (BigData). http://www.tadviser.ru/index.php/Article:Big_Data_(Big_Data)
10. BigData - electricity of the XXI century. http://bit.samag.ru/archive/article/1463
11. McKinsey Global Institute. "Big data: The next frontier for innovation, competition and productivity" (June 2011).

It was predicted that the total global volume of data created and replicated in 2011 could be about 1.8 zettabytes (1.8 trillion gigabytes) - about 9 times more than what was created in 2006.

More complex definition

However, "big data" involves more than just analyzing huge volumes of information. The problem is not that organizations create huge amounts of data, but that most of it comes in formats that fit poorly with the traditional structured database format: web logs, video, text documents, machine code or, for example, geospatial data. All this is stored in many different repositories, sometimes even outside the organization. As a result, corporations may have access to huge volumes of their data yet lack the tools needed to establish relationships within this data and draw meaningful conclusions from it. Add the fact that data is now updated more and more frequently, and you get a situation in which traditional methods of analysis cannot keep up with huge volumes of constantly refreshed data, which ultimately opens the way for big data technologies.

Best definition

In essence, the concept of big data means working with information of huge volume and diverse composition, updated very frequently and located in different sources, with the aim of increasing operational efficiency, creating new products and raising competitiveness. The consulting company Forrester puts it briefly: "Big data brings together techniques and technologies that extract meaning from data at the extreme limit of practicality."

How big is the difference between business analytics and big data?

Craig Baty, executive director of marketing and chief technology officer of Fujitsu Australia, pointed out that business analysis is a descriptive process of analyzing the results a business achieved over a certain period, whereas the processing speed of big data makes the analysis predictive, capable of offering the business recommendations for the future. Big data technologies also make it possible to analyze more types of data than business intelligence tools, allowing the focus to go beyond structured repositories.

Matt Slocum of O'Reilly Radar believes that although big data and business analytics have the same goal (finding answers to a question), they differ from each other in three aspects.

  • Big data is designed to handle larger volumes of information than business analytics, and this certainly fits the traditional definition of big data.
  • Big data is designed to process information that arrives quickly and changes quickly, which means deep exploration and interactivity. In some cases, results are generated faster than the web page loads.
  • Big data is designed to process unstructured data whose uses we are only beginning to explore once we have learned to collect and store it, and we need algorithms and dialogue capabilities to make it easier to find the trends contained in these data sets.

According to the white paper "Oracle Information Architecture: An Architect's Guide to Big Data" published by Oracle, when working with big data, we approach information differently than when conducting business analysis.

Working with big data is not like the usual business intelligence process, where simply adding up known values produces a result: for example, adding up paid invoices yields annual sales. With big data, the result is obtained while cleaning the data through sequential modeling: first a hypothesis is put forward, then a statistical, visual or semantic model is built, the accuracy of the hypothesis is checked against it, and the next one is put forward. This process requires the researcher either to interpret visual meanings, to construct interactive queries based on knowledge, or to develop adaptive "machine learning" algorithms that can produce the desired result. The lifetime of such an algorithm can moreover be quite short.
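A minimal sketch of that loop, with synthetic data standing in for the cleaned stream: each hypothesis is a model family, fitted and then checked on held-out data before the next hypothesis is tried.

```python
# Sequential-modeling sketch: propose a hypothesis (model family), fit it,
# check it on held-out data, then move to a richer hypothesis.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, (200, 1))
y = 0.5 * X[:, 0] ** 2 + rng.normal(0, 2, 200)  # process unknown to the analyst

hypotheses = {
    "linear": make_pipeline(LinearRegression()),
    "quadratic": make_pipeline(PolynomialFeatures(2), LinearRegression()),
}

for name, model in hypotheses.items():
    model.fit(X[:150], y[:150])
    score = model.score(X[150:], y[150:])   # R^2 on held-out data
    print(f"{name:10s} R^2 = {score:.3f}")  # keep the hypothesis that survives
```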

Big data analysis techniques

There are many different methods for analyzing data sets, based on tools borrowed from statistics and computer science (for example, machine learning). The list does not claim to be complete, but it reflects the most popular approaches across industries. It should be understood that researchers continue to create new techniques and improve existing ones. Moreover, some of the techniques listed do not apply exclusively to big data and can successfully be used on smaller arrays (for example, A/B testing, regression analysis). Of course, the larger and more diverse the array analyzed, the more accurate and relevant the resulting data.

A/B testing. A technique in which a control sample is compared in turn with others. It helps identify the optimal combination of indicators to achieve, for example, the best consumer response to a marketing offer. Big Data makes it possible to run a huge number of iterations and thus obtain a statistically reliable result.

Association rule learning. A set of techniques for identifying relationships, i.e. association rules between variables in large data sets. Used in data mining.
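A toy sketch of the idea: estimate support and confidence for one candidate rule, {bread} -> {milk}, over a handful of invented transactions. Production systems (for example, Apriori-style miners) search the rule space systematically.

```python
# Toy association-rule sketch: support and confidence of {bread} -> {milk}.
transactions = [
    {"bread", "milk", "eggs"},
    {"bread", "milk"},
    {"bread", "butter"},
    {"milk", "eggs"},
    {"bread", "milk", "butter"},
]

n = len(transactions)
both = sum(1 for t in transactions if {"bread", "milk"} <= t)
bread = sum(1 for t in transactions if "bread" in t)

support = both / n            # how often the itemset occurs at all
confidence = both / bread     # how often milk accompanies bread
print(f"support = {support:.2f}, confidence = {confidence:.2f}")
```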

Classification. A set of techniques that allows you to predict consumer behavior in a certain market segment (purchase decisions, churn, consumption volume, etc.). Used in data mining.

Cluster analysis. A statistical method for classifying objects into groups by identifying previously unknown common features. Used in data mining.
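A minimal sketch: let k-means find two groups in invented customer data (visits per month, average basket) without any pre-assigned labels.

```python
# Clustering sketch: segments emerge from the data itself, not from labels.
import numpy as np
from sklearn.cluster import KMeans

customers = np.array([[2, 15], [3, 18], [20, 90], [22, 95], [10, 40]])
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(customers)
print(labels)  # previously unknown customer segments
```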

Crowdsourcing. Methodology for collecting data from a large number of sources.

Data fusion and data integration. A set of techniques that allows you to analyze comments from social network users and compare them with sales results in real time.

Data mining. A set of techniques that allows you to determine the categories of consumers most susceptible to the promoted product or service, identify the characteristics of the most successful employees, and predict the behavioral model of consumers.

Ensemble learning. This method uses many predictive models, thereby improving the quality of the forecasts made.

Genetic algorithms. In this technique, possible solutions are represented as "chromosomes" that can combine and mutate. As in natural evolution, the fittest individual survives.
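A toy genetic algorithm, assuming the classic OneMax task (maximize the number of ones in a bit string): selection keeps the fittest half, crossover splices two parents, mutation flips one gene.

```python
# Minimal genetic algorithm on the OneMax toy problem.
import random

random.seed(1)
LENGTH, POP, GENERATIONS = 20, 30, 40
fitness = lambda chrom: sum(chrom)  # the fittest chromosome has the most ones

population = [[random.randint(0, 1) for _ in range(LENGTH)] for _ in range(POP)]

for _ in range(GENERATIONS):
    population.sort(key=fitness, reverse=True)
    survivors = population[: POP // 2]          # selection of the fittest
    children = []
    while len(children) < POP - len(survivors):
        a, b = random.sample(survivors, 2)
        cut = random.randrange(1, LENGTH)
        child = a[:cut] + b[cut:]               # crossover of two parents
        i = random.randrange(LENGTH)
        child[i] ^= 1                           # mutation of one gene
        children.append(child)
    population = survivors + children

best = max(population, key=fitness)
print(fitness(best), "ones out of", LENGTH)
```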

Machine learning. A direction in computer science (historically it has been given the name “artificial intelligence”), which pursues the goal of creating self-learning algorithms based on the analysis of empirical data.

Natural language processing (NLP). A set of techniques, borrowed from computer science and linguistics, for recognizing natural human language.

Network analysis. A set of techniques for analyzing connections between nodes in networks. In relation to social networks, it allows you to analyze the relationships between individual users, companies, communities, etc.

Optimization. A set of numerical methods for redesigning complex systems and processes to improve one or more metrics. Helps in making strategic decisions, for example, the composition of the product line to be launched on the market, conducting investment analysis, etc.

Pattern recognition. A set of techniques with self-learning elements for predicting the behavioral model of consumers.

Predictive modeling. A set of techniques that allow you to create a mathematical model of a predetermined probable scenario for the development of events. For example, analysis of the CRM system database for possible conditions that will prompt subscribers to change providers.

Regression. A set of statistical methods for identifying a pattern between changes in a dependent variable and one or more independent variables. Often used for forecasting and predictions. Used in data mining.
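A small sketch: fit a line through invented ad-spend and sales figures, then use the estimated pattern for a forecast.

```python
# Regression sketch: relate a dependent variable to an independent one.
import numpy as np
from sklearn.linear_model import LinearRegression

ad_spend = np.array([[10], [15], [20], [25], [30]])   # independent variable
sales = np.array([103, 151, 198, 255, 301])           # dependent variable

reg = LinearRegression().fit(ad_spend, sales)
forecast = reg.predict([[40]])[0]                     # extrapolate the pattern
print(f"slope = {reg.coef_[0]:.1f}, forecast at 40: {forecast:.0f}")
```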

Sentiment analysis. Techniques for assessing consumer sentiment based on natural language recognition technologies. They make it possible to isolate, from the general information flow, messages related to the subject of interest (for example, a consumer product), and then to evaluate the polarity of the judgment (positive or negative), the degree of emotionality, and so on.
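A deliberately naive sketch of polarity scoring against a tiny hand-made lexicon; real systems rely on NLP models rather than word lists.

```python
# Naive sentiment sketch: count positive and negative words per message.
POSITIVE = {"great", "love", "excellent", "good"}
NEGATIVE = {"bad", "awful", "hate", "broken"}

def polarity(text: str) -> int:
    words = text.lower().split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

stream = ["I love this phone, excellent camera",
          "awful battery, hate it",
          "arrived on time"]
for msg in stream:
    print(polarity(msg), msg)  # >0 positive, <0 negative, 0 neutral
```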

Signal processing. A set of techniques borrowed from radio engineering that aims to recognize a signal against a background of noise and its further analysis.

Spatial analysis. A set of methods, partly borrowed from statistics, for analyzing spatial data: terrain topology, geographic coordinates, object geometry. The source of big data in this case is often geographic information systems (GIS).

  • Revolution Analytics (based on the R language for mathematical statistics).

Of particular interest on this list is Apache Hadoop, open source software that has proven itself as a data analyzer over the past five years. As soon as Yahoo opened the Hadoop code to the open source community, a whole movement of Hadoop-based products appeared in the IT industry. Almost all modern big data analysis tools provide Hadoop integration; their developers range from startups to well-known global companies.
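Hadoop's MapReduce programming model can be shown in miniature: map each record to key/value pairs, shuffle them into groups by key, and reduce each group. Hadoop distributes these same three phases across thousands of nodes; this sketch runs them in a single process.

```python
# MapReduce word count in miniature: map, shuffle, reduce.
from collections import defaultdict

records = ["big data is big", "data flows and flows"]

# Map phase: emit (word, 1) for every word
mapped = [(word, 1) for line in records for word in line.split()]

# Shuffle phase: group values by key
groups = defaultdict(list)
for key, value in mapped:
    groups[key].append(value)

# Reduce phase: aggregate each group
counts = {word: sum(values) for word, values in groups.items()}
print(counts)  # {'big': 2, 'data': 2, ...}
```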

Markets for Big Data Management Solutions

Big Data Platforms (BDP, Big Data Platform) as a means of combating digital hoarding

The ability to analyze big data, colloquially called Big Data, is perceived as a benefit, and unambiguously so. But is it really? What can unrestrained data accumulation lead to? Most likely to what domestic psychologists, speaking of humans, call pathological hoarding, syllogomania, or figuratively "Plyushkin syndrome". In English the vicious passion to collect everything is called hoarding (from "hoard"). According to the classification of mental illnesses, hoarding is a mental disorder. In the digital era, digital hoarding joins traditional material hoarding; it can afflict individuals and entire enterprises and organizations alike.

World and Russian market

Big Data Landscape: main suppliers

Almost all leading IT companies have shown interest in tools for the collection, processing, management and analysis of big data, which is quite natural. First, they encounter the phenomenon directly in their own business; second, big data opens excellent opportunities for developing new market niches and attracting new customers.

Many startups that build their business on processing huge volumes of data have appeared on the market. Some of them use ready-made cloud infrastructure provided by large players such as Amazon.

Theory and practice of Big Data in industries

History of development

2017

TmaxSoft forecast: the next “wave” of Big Data will require modernization of the DBMS

Enterprises know that the vast amounts of data they accumulate contain important information about their business and clients. If a company can successfully apply this information, it will have a significant advantage over its competitors and will be able to offer better products and services than theirs. However, many organizations still fail to effectively use big data due to the fact that their legacy IT infrastructure is unable to provide the necessary storage capacity, data exchange processes, utilities and applications required to process and analyze large amounts of unstructured data to extract valuable information from them, TmaxSoft indicated.

Additionally, the increased processing power needed to analyze ever-increasing volumes of data may require significant investment in an organization's legacy IT infrastructure, as well as additional maintenance resources that could be used to develop new applications and services.

On February 5, 2015, the White House released a report discussing how companies use "big data" to charge different customers different prices, a practice known as "price discrimination" or "personalized pricing". The report describes the benefits of big data for both sellers and buyers, and its authors conclude that many of the issues raised by big data and differential pricing can be addressed through existing anti-discrimination and consumer protection laws and regulations.

The report notes that at this time there is only anecdotal evidence of how companies are using big data in the context of personalized marketing and differentiated pricing. This information shows that sellers use pricing methods that can be divided into three categories:

  • study of the demand curve;
  • steering and differentiated pricing based on demographic data;
  • targeted behavioral marketing (behavioral targeting) and individualized pricing.

Studying the Demand Curve: To determine demand and study consumer behavior, marketers often conduct experiments in this area in which customers are randomly assigned to one of two possible price categories. “Technically, these experiments are a form of differential pricing because they result in different prices for customers, even if they are “non-discriminatory” in the sense that all customers have the same probability of being “sent” to a higher price.”

Steering is the practice of presenting products to consumers based on their membership in a specific demographic group. For example, a computer company's website may offer the same laptop at different prices to different types of customers based on information they report about themselves (for example, whether the user is a government, academic, commercial or individual buyer) or on their geographic location (for example, determined by the computer's IP address).

Targeted behavioral marketing and customized pricing: In these cases, customers' personal information is used to target advertising and customize pricing for certain products. For example, online advertisers use data collected by advertising networks and through third-party cookies about online user activity to target their advertisements. On the one hand, this approach allows consumers to receive advertising for goods and services of interest to them. It may, however, worry consumers who do not want certain types of their personal data (such as information about visits to websites related to medical or financial matters) to be collected without their consent.

Although targeted behavioral marketing is widespread, there is relatively little evidence of personalized pricing in the online environment. The report speculates that this may be because the methods are still being developed, or because companies are hesitant to use custom pricing (or prefer to keep quiet about it) - perhaps fearing a backlash from consumers.

The report's authors suggest that "for the individual consumer, the use of big data clearly presents both potential rewards and risks." While acknowledging that big data raises transparency and discrimination issues, the report argues that existing anti-discrimination and consumer protection laws are sufficient to address them. However, the report also highlights the need for “ongoing oversight” when companies use sensitive information in ways that are not transparent or in ways that are not covered by existing regulatory frameworks.

This report continues the White House's efforts to examine the use of big data and discriminatory pricing on the Internet and the resulting consequences for American consumers. It was previously reported that the White House Big Data Working Group published its report on this issue in May 2014. The Federal Trade Commission (FTC) also addressed these issues during its September 2014 workshop on big data discrimination.

2014

Gartner dispels myths about Big Data

A fall 2014 research note from Gartner lists a number of common Big Data myths among IT leaders and provides rebuttals to them.

  • Everyone is implementing Big Data processing systems faster than us

Interest in Big Data technologies is at an all-time high: 73% of the organizations surveyed by Gartner analysts in 2014 were already investing in them or planning to do so. But most of these initiatives are still at a very early stage, and only 13% of respondents had already implemented such solutions. The hardest part is working out how to extract revenue from Big Data and deciding where to start. Many organizations get stuck at the pilot stage because they cannot tie the new technology to specific business processes.

  • We have so much data that there is no need to worry about small errors in it

Some IT managers believe that small flaws in the data do not affect the overall result of analyzing huge volumes. When there is a lot of data, each individual error does have less of an impact on the result, analysts note, but the errors themselves also become more numerous. In addition, most of the data being analyzed is external, of unknown structure or origin, so the likelihood of errors grows. So in the world of Big Data, data quality actually matters much more.

  • Big Data technologies will eliminate the need for data integration

Big Data promises the ability to process data in its original format, with the schema generated automatically as the data is read. It is believed that this will allow information from the same sources to be analyzed using multiple data models. Many believe it will also let end users interpret any data set as they see fit. In reality, most users often want the traditional approach with a ready-made schema, where the data is formatted appropriately and there are agreements on the integrity of the information and on how it should relate to the usage scenario.
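
As a rough sketch of what schema-on-read looks like in practice, here is a PySpark example; it assumes a local Spark installation and a hypothetical events.json file of semi-structured records.

```python
from pyspark.sql import SparkSession
from pyspark.sql.types import LongType, StringType, StructField, StructType

spark = SparkSession.builder.appName("schema-on-read").getOrCreate()

# Schema-on-read: Spark samples the raw JSON and infers a schema at read
# time; no schema had to be declared before the data was stored.
raw = spark.read.json("events.json")
raw.printSchema()

# The same raw file can later be re-read under an explicit, agreed-upon
# model, which is the "ready-made schema" most users end up asking for.
explicit = StructType([
    StructField("user_id", StringType()),
    StructField("timestamp", LongType()),
])
typed = spark.read.schema(explicit).json("events.json")
```

Both reads hit the same raw file: schema-on-read defers the modeling decision rather than eliminating it, which is Gartner's point about the integration work not disappearing.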

  • There is no point in using data warehouses for complex analytics

Many information management system administrators believe that there is no point in spending time creating a data warehouse, given that complex analytical systems rely on new types of data. In fact, many complex analytics systems use information from a data warehouse. In other cases, new types of data need to be additionally prepared for analysis in Big Data processing systems; decisions have to be made about the suitability of the data, the principles of aggregation and the required level of quality - such preparation may occur outside the warehouse.

  • Data warehouses will be replaced by data lakes

In reality, vendors mislead customers by positioning data lakes as a replacement for data warehouses or as critical elements of the analytical infrastructure. The technologies underlying data lakes lack the maturity and breadth of functionality found in warehouses. Therefore, managers responsible for data management should wait until lakes reach the same level of development, Gartner argues.

Accenture: 92% of those who implemented big data systems are satisfied with the results

Among the main advantages of big data, respondents named:

  • “searching for new sources of income” (56%),
  • “improving customer experience” (51%),
  • “new products and services” (50%) and
  • “an influx of new customers and maintaining the loyalty of old ones” (47%).

When introducing new technologies, many companies ran into familiar problems. For 51% the stumbling block was security, for 47% the budget, for 41% a lack of the necessary personnel, and for 35% difficulties integrating with existing systems. Almost all of the companies surveyed (about 91%) plan to solve the staff shortage soon and hire big data specialists.

Companies are optimistic about the future of big data technologies: 89% believe they will change business as much as the Internet did, and 79% of respondents noted that companies that do not engage with big data will lose their competitive advantage.

However, respondents disagreed about what exactly should be considered big data. 65% of respondents believe that these are “large data files”, 60% believe that this is “advanced analytics and analysis”, and 50% believe that this is “data visualization tools”.

Madrid spends €14.7 million on big data management

In July 2014, it became known that Madrid would use big data technologies to manage its city infrastructure. The project costs 14.7 million euros, and the solutions being deployed are based on technologies for analyzing and managing big data. With their help, the city administration will manage the work of each service provider and pay accordingly, depending on the level of service.

The contractors in question monitor the condition of streets, lighting, irrigation and green spaces, clean the territory, and remove and recycle waste. During the project, 300 key performance indicators for city services were developed, on the basis of which specially appointed inspectors will carry out 1.5 thousand different checks and measurements daily. In addition, the city will begin using an innovative technology platform called Madrid iNTeligente (MiNT) - Smarter Madrid.

2013

Experts: Big Data is in fashion

All vendors in the data management market, without exception, are currently developing technologies for managing Big Data. This new technological trend is actively discussed by the professional community: developers, industry analysts and potential consumers of such solutions.

As DataSift found, by January 2013 the wave of discussion around “big data” had exceeded all imaginable proportions. Having analyzed the number of mentions of Big Data on social networks, DataSift calculated that in 2012 the term was used about 2 million times in posts created by about 1 million different authors around the world. This works out to an average of roughly 260 posts per hour, with a peak of 3,070 mentions per hour.

Gartner: Every second CIO is ready to spend money on Big data

After several years of experimentation with Big Data technologies and the first implementations, in 2013 the adoption of such solutions will grow significantly, Gartner predicts. Researchers surveyed IT leaders around the world and found that 42% of respondents had already invested in Big Data technologies or planned to make such investments within the next year (data as of March 2013).

Companies are forced to spend money on big data processing technologies because the information landscape is changing rapidly and demands new approaches to processing information. Many companies have already realized that large volumes of data are critically important, and working with them yields benefits that are not available with traditional sources of information and traditional ways of processing it. In addition, the constant discussion of “big data” in the media fuels interest in the relevant technologies.

Frank Buytendijk, a vice president at Gartner, even called on companies to temper their zeal, since some worry that they are falling behind competitors in adopting Big Data.

“There is no need to worry; the possibilities for implementing ideas based on big data technologies are virtually endless,” he said.

Gartner predicts that by 2015, 20% of Global 1000 companies will have a strategic focus on “information infrastructure.”

In anticipation of the new opportunities that big data processing technologies will bring, many organizations are already organizing the process of collecting and storing various types of information.

For educational and government organizations, as well as industrial companies, the greatest potential for business transformation lies in combining accumulated data with so-called dark data: email messages, multimedia and other similar content. According to Gartner, the winners in the data race will be those who learn to handle the widest variety of information sources.

Cisco survey: Big Data will help increase IT budgets

The Spring 2013 Cisco Connected World Technology Report, based on a survey conducted in 18 countries by the independent research firm InsightExpress, polled 1,800 college students and an equal number of young professionals between the ages of 18 and 30. The survey was conducted to gauge how ready IT departments are to implement Big Data projects and to gain insight into the challenges involved, the technological shortcomings and the strategic value of such projects.

Most companies collect, record and analyze data. However, the report says, many companies face a range of complex business and information technology challenges with Big Data. For example, 60 percent of respondents admit that Big Data solutions can improve decision-making processes and increase competitiveness, but only 28 percent said that they are already receiving real strategic benefits from the accumulated information.

More than half of the IT executives surveyed believe that Big Data projects will help increase IT budgets in their organizations, as there will be increased demands on technology, personnel and professional skills. At the same time, more than half of respondents expect that such projects will increase IT budgets in their companies as early as 2012. 57 percent are confident that Big Data will increase their budgets over the next three years.

81 percent of respondents said that all (or at least some) Big Data projects will require the use of cloud computing. The spread of cloud technologies may therefore affect both the adoption rate of Big Data solutions and their business value.

Companies collect and use data of many different types, both structured and unstructured. (The Cisco Connected World Technology Report also breaks down the sources from which survey participants receive their data.)

Nearly half (48 percent) of IT leaders predict the load on their networks will double over the next two years. (This is especially true in China, where 68 percent of respondents share this view, and in Germany – 60 percent). 23 percent of respondents expect network load to triple over the next two years. At the same time, only 40 percent of respondents declared their readiness for explosive growth in network traffic volumes.

27 percent of respondents admitted that they need better IT policies and information security measures.

21 percent need more bandwidth.

Big Data opens up new opportunities for IT departments to add value and build strong relationships with business units, allowing them to increase revenue and strengthen the company's financial position. Big Data projects make IT departments a strategic partner to business departments.

According to 73 percent of respondents, the IT department will become the main driver of the implementation of the Big Data strategy. At the same time, respondents believe that other departments will also be involved in the implementation of this strategy. First of all, this concerns the departments of finance (named by 24 percent of respondents), research and development (20 percent), operations (20 percent), engineering (19 percent), as well as marketing (15 percent) and sales (14 percent).

Gartner: Millions of new jobs needed to manage big data

Global IT spending will reach $3.7 trillion in 2013, which is 3.8% more than spending on information technology in 2012 (the year-end forecast is $3.6 trillion). The big data segment will develop at a much faster pace, according to a Gartner report.

By 2015, 4.4 million IT jobs will be created worldwide to service big data, of which 1.9 million will be in the United States. Moreover, each such job will entail the creation of three additional jobs outside the IT sector, so that in the United States alone, 6 million people will be working to support the information economy over the next four years.

According to Gartner experts, the main problem is that the industry does not have enough talent for this: both the private and the public educational systems, in the United States for example, are unable to supply the industry with enough qualified personnel. As a result, only one in three of the new IT jobs mentioned will be filled.

Analysts believe that the role of nurturing qualified IT personnel should be taken on directly by the companies that urgently need them, since such employees will be their ticket into the new information economy of the future.

2012

The first skepticism regarding "Big Data"

Analysts at Ovum and Gartner suggest that for big data, the fashionable topic of 2012, the time may come to be freed of illusions.

At the time, the term “Big Data” usually referred to the ever-growing volume of information flowing online from social media, sensor networks and other sources, as well as to the growing range of tools used to process this data and extract business-relevant trends from it.

“Because of (or despite) the hype around the idea of big data, vendors looked at this trend in 2012 with great hope,” said Tony Baer, an analyst at Ovum.

Baer reported that DataSift had conducted a retrospective analysis of big data mentions in social media in 2012.

