Top 10 Big Data Trends for 2017



10 Predictions for 2017: Data Science, Machine Learning, and IoT

2017-01-10 10:40  36大数据  

Summary: Vincent Granville recently published his 2017 predictions for data science, machine learning, and the Internet of Things on Data Science Central.

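The translation follows.

It is once again time to share predictions for 2017. Here are mine to get the discussion started; I hope you will share your own views as well.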

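1. Data science and machine learning will become more mainstream, especially in the following sectors: energy, finance (banking, insurance), agriculture (precision farming), transportation, urban planning, healthcare (personalized treatment), and even government.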

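2. Some people with no background in data science will try to create a legal framework governing how data may be analyzed and how algorithms operate, and will try to force the disclosure of algorithm secrets. I believe they will fail. Obamacare is one example: its predictive algorithms ignored age and gender when computing premiums, which left people paying higher premiums.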

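3. Sensor data will rise. In other words, the IoT will bring an explosion of data, but data quality, data relevance, and data security will remain critical.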

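4. With the rise of the IoT, more processes (such as driving, medical diagnosis, and treatment) will be automated through algorithms for machine-to-machine or device-to-device communication that rely on artificial intelligence, deep learning, and automated data science. I am currently writing an article that describes the differences between machine learning, the IoT, artificial intelligence, deep learning, and data science. You can sign up on DSC so that you do not miss it.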

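5. The boundaries between artificial intelligence, the IoT, data science, machine learning, deep learning, and operations research will blur. Statistical engineering will increasingly appear in applications, machine learning, artificial intelligence, and data science.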

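6. Many systems still do not work properly. The fix lies with people, not algorithms. As I mention in my article "Why So Many Machine Learning Implementations Fail", a typical example is Google Analytics. It fails to catch a large volume of obvious, basic bot traffic, the kind a person could filter or block without any knowledge of statistics or data science. Although basic solutions to these problems have been devised, the problems keep growing rather than shrinking. Fake reviews, fake news, undetected hate speech on Twitter, and undetected plagiarism in Google search all fall into the same category. Ultimately, this leaves room for new players to enter and build systems that actually work.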

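7. Reliance on public data and public news will come under closer scrutiny. Some say that the failed election predictions were a failure of data science. In my view it was a different kind of failure: a failure to recognize media bias (outlets publish predictions that fit their own agenda) and to recognize that even the surveys were biased (full of lies). It was also a failure to account for the high volatility of elections and the large day-to-day swings. Anyone able to compute sound confidence intervals from historical data would have considered these predictions unreliable. Finally, I have always believed that the winner is the one best at playing tricks, including manipulating hackers and bribing the media.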

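8. More and more data cleaning, preprocessing, and exploratory data analysis will be automated. We will also face more unstructured data, along with methods to turn it into structured data. Multiple algorithms and models will increasingly be blended to deliver the best-performing pattern recognition and prediction systems and to improve accuracy.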

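9. Data science education will keep evolving, led by university programs run by leading practitioners, and fewer people will find jobs through data science boot camps. Many such camps do not train you to be a data scientist; they turn you into a Python/R/SQL coder who knows only classic, basic, even outdated and dangerous statistics. Data camps will therefore have to improve or risk becoming another University of Phoenix.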

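10. Attacks on data infrastructure will shift from stealing or erasing data to modifying it. If security holes are not fixed, some attacks will start from IoT devices.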
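
Prediction 8 above mentions blending multiple algorithms and models. One simple way to blend is majority voting, sketched below under loud assumptions: the three "models" are hypothetical hand-written rules standing in for trained classifiers, and every function name is illustrative rather than taken from Granville's article.

```python
from collections import Counter


def keyword_model(text):
    """Hypothetical model 1: flags messages that mention 'free'."""
    return "spam" if "free" in text.lower() else "ham"


def shouting_model(text):
    """Hypothetical model 2: flags messages written mostly in capitals."""
    letters = [c for c in text if c.isalpha()]
    if not letters:
        return "ham"
    upper_ratio = sum(c.isupper() for c in letters) / len(letters)
    return "spam" if upper_ratio > 0.5 else "ham"


def link_model(text):
    """Hypothetical model 3: flags messages that contain a web link."""
    return "spam" if "http://" in text or "https://" in text else "ham"


MODELS = [keyword_model, shouting_model, link_model]


def ensemble_predict(models, text):
    """Blend the models by majority vote over their individual labels."""
    votes = Counter(model(text) for model in models)
    return votes.most_common(1)[0][0]


for message in ["Lunch at noon?",
                "Get a free quote",
                "FREE PRIZE at http://example.com"]:
    print(message, "->", ensemble_predict(MODELS, message))
```

In the second message only one of the three rules fires, so the vote overrules it. That is the basic appeal of a blend: a single model's mistake does not decide the outcome on its own.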

12. SpagoBI

SpagoBI is an open source business intelligence and big data analytics
platform. The software is completely free, but paid user support,
maintenance, consulting and training are available for purchase. It
includes tools for reporting, multidimensional analysis (OLAP), charts,
location intelligence, data mining, ETL and more. It also integrates
with popular in-memory processing engines and enables real-time
processing.

Top 10 Big Data Trends for 2017

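  Researchers at Forrester found that in 2016 almost 40 percent of firms were implementing and expanding big data technology, and another 30 percent planned to adopt big data within the next 12 months. The Big Data Executive Survey 2016 from NewVantage Partners found that 62.5 percent of firms now have at least one big data project in production, and only 5.4 percent have no big data initiatives planned or underway.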

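  Researchers say that more and more companies will accelerate their adoption of big data technology. International Data Corporation (IDC) predicts that the big data and analytics market will grow from $130.1 billion this year to $203 billion in 2020. “Rising requirements for data availability, the emergence and development of a new generation of technology, and the cultural shift brought by data-driven decision making continue to drive demand for big data and analytics technology and services,” said Dan Vesset, vice president at IDC. “The market's worldwide revenue was $122 billion in 2015; it is forecast to grow 11.3 percent in 2016 and to continue growing at a compound annual growth rate (CAGR) of 11.7 percent through 2020.”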

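  Editor's note: CAGR is not the same as the actual growth rate (GR) in any given year. It describes the constant annual rate that would produce the same overall return if growth were steady. In effect, CAGR smooths the return curve so that sharp short-term swings do not obscure the trend.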
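
  The editor's note can be made concrete with a small calculation. The sketch below applies the standard CAGR formula, (end / start) ** (1 / years) - 1, to the IDC figures quoted above; the function names `cagr` and `project` are illustrative choices, not part of any IDC methodology.

```python
def cagr(start_value, end_value, years):
    """Constant yearly growth rate that turns start_value into end_value."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1.0 / years) - 1.0


def project(start_value, rate, years):
    """Values for year 0 through `years` when growing at a constant rate."""
    return [start_value * (1.0 + rate) ** n for n in range(years + 1)]


# Figures quoted in the article: $130.1 billion (2016) to $203 billion (2020).
implied_rate = cagr(130.1, 203.0, years=4)
print(f"Implied CAGR 2016-2020: {implied_rate:.2%}")

# The smoothed path at the 11.7 percent CAGR that IDC cites.
for year, value in zip(range(2016, 2021), project(130.1, 0.117, 4)):
    print(year, round(value, 1))
```

  The implied rate comes out slightly above 11.7 percent because the article rounds the 2020 figure to $203 billion. The projected path is the smoothed curve the note describes; actual yearly growth can sit above or below it.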

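  There is no doubt that the big data market will keep growing, but how should organizations put big data to use? There is not yet a clear answer. New big data technologies are entering the market, while the use of some older technologies continues to grow. This article covers ten trends in the future of big data that are likely to have a major impact on the big data market in 2017 and beyond.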

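  Big Data Trends

  Experts expect machine learning, predictive analytics, the IoT, and edge computing to have a profound impact on big data projects in 2017 and beyond.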

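  1. Open Source

  Open source applications such as Apache Hadoop and Spark have come to dominate the big data space. One survey found that nearly 60 percent of enterprises expect to have Hadoop clusters in production by the end of this year. According to Forrester, Hadoop usage is growing 32.9 percent per year.

  Experts say that in 2017 many enterprises will continue to expand their use of Hadoop and NoSQL technologies and will look for ways to speed up their big data processing.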

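  2. In-Memory Technology

  Many companies are trying to speed up their big data processing, and one technology they are adopting is in-memory technology. In a traditional database, data is kept in storage systems equipped with hard drives or solid-state drives (SSDs). Modern in-memory technology keeps the data in RAM instead, which makes access far faster. A Forrester Research report forecasts that in-memory data fabric will grow 29.2 percent per year.

  Many vendors currently offer in-memory database technology, most notably SAP, IBM, and Pivotal.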

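  3. Machine Learning

  As big data analytics capabilities have improved, many enterprises have begun investing in machine learning (ML). Machine learning is a branch of artificial intelligence that lets computers learn new things without being explicitly programmed. In other words, it analyzes big data to reach conclusions.

  Gartner names machine learning one of its top 10 strategic technology trends for 2017. It notes that today's most advanced machine learning and artificial intelligence systems are moving beyond traditional rule-based algorithms to create systems that understand, learn, predict, adapt, and can even operate autonomously.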

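  4. Predictive Analytics

  Predictive analytics is closely related to machine learning; in fact, ML systems often power predictive analytics software. In the early days of big data analytics, organizations reviewed their data to find out what had happened, and later they began using analytics tools to investigate why it happened. Predictive analytics goes one step further and uses big data analysis to predict what will happen in the future.

  According to a 2016 survey from PwC, only 29 percent of companies currently use predictive analytics, which is not many. Meanwhile, numerous vendors have recently released predictive analytics tools. As businesses become more aware of how powerful these tools are, that number could surge in the coming years.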

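  5. Intelligent Apps

  Another way enterprises use machine learning and AI technologies is to build intelligent apps. These applications use big data analytics to analyze users' past behavior and provide personalized service. Recommendation engines are a familiar example.

  Gartner ranked intelligent apps second in its list of Top 10 Strategic Technology Trends for 2017. “Over the next 10 years, virtually every app, application and service will incorporate some level of AI,” said David Cearley, vice president at Gartner.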

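  6. Intelligent Security

  Many enterprises are also incorporating big data analytics into their security strategy. An organization's security logs record past attempted cyberattacks, and organizations can use that data to predict and prevent future attacks and to reduce the damage they cause. Some companies are integrating their security information and event management (SIEM) software with big data platforms such as Hadoop. Others are turning to vendors whose products offer big data analytics capabilities.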

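  7. IoT

  The Internet of Things is also likely to have a sizable impact on big data. According to a September 2016 report from IDC, “31.4 percent of organizations surveyed have launched IoT solutions, with an additional 43 percent looking to deploy in the next 12 months.”

  As these new devices and applications come online, many companies will need new technologies and systems to handle and make sense of the large volume of data coming from the IoT.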

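  8. Edge Computing

  Edge computing is a new technology that can help companies cope with IoT big data. In edge computing, the analysis happens very close to the IoT devices and sensors rather than in a data center or the cloud. The benefits for enterprises are clear. Less data flows over the network, which improves network performance and saves on cloud computing costs. It also lets companies delete IoT data that has expired or has no further value, reducing storage and infrastructure costs. Edge computing can also speed up analysis, so decision makers gain insight and act sooner.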
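
  As a rough illustration of why less data crosses the network, the sketch below summarizes sensor readings in fixed-size windows on the device and forwards only the summaries. The window size, the alert threshold, the sample temperatures and the function names are all assumptions made for this example, not details from the article.

```python
from statistics import mean


def summarize_window(readings, threshold):
    """Reduce one window of raw readings to a single compact record."""
    if not readings:
        raise ValueError("window is empty")
    return {
        "count": len(readings),
        "mean": mean(readings),
        "min": min(readings),
        "max": max(readings),
        # Readings above the threshold are kept in full for follow-up.
        "alerts": [r for r in readings if r > threshold],
    }


def edge_process(stream, window_size, threshold):
    """Yield one summary per window; the raw readings are then dropped."""
    window = []
    for reading in stream:
        window.append(reading)
        if len(window) == window_size:
            yield summarize_window(window, threshold)
            window = []  # raw data stays at the edge and is not uploaded
    if window:  # flush a final, partially filled window
        yield summarize_window(window, threshold)


# Hypothetical temperature readings from one sensor.
temperatures = [21.0, 21.5, 22.0, 35.0, 21.0, 20.5, 21.0, 21.5]
summaries = list(edge_process(temperatures, window_size=4, threshold=30.0))
print(len(temperatures), "readings reduced to", len(summaries), "records")
for record in summaries:
    print(record)
```

  Only the summaries and the out-of-range readings would travel to the data center or cloud, which is the network and storage saving described above. The trade-off is that raw readings discarded at the edge cannot be reanalyzed later.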

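  9. High Salaries

  For IT workers, the growth of big data means high demand for people with big data skills. According to IDC, “In the U.S. alone there will be 181,000 deep analytics roles in 2018 and five times that many positions requiring related skills in data management and interpretation.”

  Because of this talent gap, Robert Half Technology predicts that average compensation for data scientists will rise 6.5 percent in 2017, ranging from $116,000 to $163,500 a year (these are U.S. figures; comparable statistics for China are not yet available). Similarly, big data engineers should see pay rise 5.8 percent next year, with salaries ranging from $135,000 to $196,000.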

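  10. Self-Service

  Because hiring senior experts costs so much, many companies are turning to data analytics tools. IDC has previously predicted, “Visual data discovery tools will be growing 2.5 times faster than rest of the business intelligence (BI) market. By 2018, investing in this enabler of end-user self service will become a requirement for all enterprises.”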

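  Several big data vendors have already launched analytics tools with “self-service” capabilities, and experts expect the trend to continue into 2017 and beyond. IT departments will become less involved in the analytics process, and big data analytics will become increasingly integrated into how people in every department do their work.

  Original English text: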

Top 10 Trends in Big Data

“Big data” is no longer just a buzzword. Researchers at Forrester have
“found that, in 2016, almost 40 percent of firms are implementing and
expanding big data technology adoption. Another 30 percent are planning
to adopt big data in the next 12 months.”

Similarly, the Big Data Executive Survey 2016 from NewVantage Partners
found that 62.5 percent of firms now have at least one big data project
in production, and only 5.4 percent of organizations have no big data
initiatives planned or underway.

Researchers say the adoption of big data technologies is unlikely to
slow anytime soon. IDC predicts that the big data and business analytics
market will increase from $130.1 billion this year to more than $203
billion in 2020. “The availability of data, a new generation of
technology, and a cultural shift toward data-driven decision making
continue to drive demand for big data and analytics technology and
services,” said Dan Vesset, group vice president, analytics and
information management. “This market is forecast to grow 11.3 percent in
2016 after revenues reached $122 billion worldwide in 2015 and is
expected to continue at a compound annual growth rate (CAGR) of 11.7
percent through 2020.”

While it’s clear that the big data market will grow, how organizations
will be using their big data is a little less clear. New big data
technologies are entering the market, while use of some older
technologies continues to grow. This slideshow covers ten top trends
that will likely shape the big data market in 2017 and beyond.

Big Data Trends

  Experts expect machine learning, predictive analytics, IoT and edge
computing to have a big impact on big data projects in 2017 and beyond.

  1. Open Source

  Open source applications like Apache Hadoop, Spark and others have
come to dominate the big data space, and that trend looks likely to
continue. One survey found that nearly 60 percent of enterprises expect
to have Hadoop clusters running in production by the end of this year.
And according to Forrester, Hadoop usage is increasing 32.9 percent per
year.

  Experts say that in 2017, many enterprises will expand their use of
Hadoop and NoSQL technologies, as well as looking for ways to speed up
their big data processing. Many will be seeking technologies that allow
them to access and respond to data in real time.

  2. In-Memory Technology

  One of the technologies that companies are investigating in an
attempt to speed their big data processing is in-memory technology. In a
traditional database, the data is stored in storage systems equipped
with hard drives or solid state drives (SSDs). In-memory technology
stores the data in RAM instead, which is many, many times faster. A
report from Forrester Research forecasts that in-memory data fabric will
grow 29.2 percent per year.

  Several different vendors offer in-memory database technology,
notably SAP, IBM and Pivotal.

  3. Machine Learning

  As big data analytics capabilities have progressed, some enterprises
have begun investing in machine learning (ML). Machine learning is a
branch of artificial intelligence that focuses on allowing computers to
learn new things without being explicitly programmed. In other words, it
analyzes existing big data stores to come to conclusions which change
how the application behaves.

  According to Gartner, machine learning is one of the top 10 strategic
technology trends for 2017. It noted that today’s most advanced machine
learning and artificial intelligence systems are moving “beyond
traditional rule-based algorithms to create systems that understand,
learn, predict, adapt and potentially operate autonomously.”

  4. Predictive Analytics

  Predictive analytics is closely related to machine learning; in
fact, ML systems often provide the engines for predictive analytics
software. In the early days of big data analytics, organizations were
looking back at their data to see what happened and then later they
started using their analytics tools to investigate why those things
happened. Predictive analytics goes one step further, using the big data
analysis to predict what will happen in the future.

  The number of organizations using predictive analytics today is
surprisingly low—only 29 percent according to a 2016 survey from PwC.
However, numerous vendors have recently come out with predictive
analytics tools, so that number could skyrocket in the coming years as
businesses become more aware of this powerful tool.

  5. Intelligent Apps

  Another way that enterprises are using machine learning and AI
technologies is to create intelligent apps. These applications often
incorporate big data analytics, analyzing users’ previous behaviors in
order to provide personalization and better service. One example that
has become very familiar is the recommendation engines that now power
many ecommerce and entertainment apps.

  In its list of Top 10 Strategic Technology Trends for 2017, Gartner
listed intelligent apps second. “Over the next 10 years, virtually every
app, application and service will incorporate some level of AI,” said
David Cearley, vice president and Gartner Fellow. “This will form a
long-term trend that will continually evolve and expand the application
of AI and machine learning for apps and services.”

  6. Intelligent Security

  Many enterprises are also incorporating big data analytics into
their security strategy. Organizations’ security log data provides a
treasure trove of information about past cyberattack attempts that
organizations can use to predict, prevent and mitigate future attempts.
As a result, some organizations are integrating their security
information and event management (SIEM) software with big data platforms
like Hadoop. Others are turning to security vendors whose products
incorporate big data analytics capabilities.

  7. IoT

  The Internet of Things is also likely to have a sizable impact on
big data. According to a September 2016 report from IDC, “31.4 percent
of organizations surveyed have launched IoT solutions, with an
additional 43 percent looking to deploy in the next 12 months.”

  With all those new devices and applications coming online,
organizations are going to experience even faster data growth than they
have experienced in the past. Many will need new technologies and
systems in order to be able to handle and make sense of the flood of big
data coming from their IoT deployments.

  8. Edge Computing

One new technology that could help companies deal with their IoT big
data is edge computing. In edge computing, the big data analysis happens
very close to the IoT devices and sensors instead of in a data center or
the cloud. For enterprises, this offers some significant benefits. They
have less data flowing over their networks, which can improve
performance and save on cloud computing costs. It allows organizations
to delete IoT data that is only valuable for a limited amount of time,
reducing storage and infrastructure costs. Edge computing can also speed
up the analysis process, allowing decision makers to take action on
insights faster than before.

  9. High Salaries

For IT workers, the increase in big data analytics will likely mean high
demand and high salaries for those with big data skills. According to
IDC, “In the U.S. alone there will be 181,000 deep analytics roles in
2018 and five times that many positions requiring related skills in data
management and interpretation.”

As a result of that scarcity, Robert Half Technology predicts that
average compensation for data scientists will increase 6.5 percent in
2017 and range from $116,000 to $163,500. Similarly, big data engineers
should see pay increases of 5.8 percent with salaries ranging from
$135,000 to $196,000 for next year.

  Image Source: Robert Half Technology 2017 Salary Guide for
Technology Professionals

  10. Self-Service

As the cost of hiring big data experts rises, many organizations are
likely to be looking for tools that allow regular business professionals
to meet their own big data analytics needs. IDC has previously predicted,
“Visual data discovery tools will be growing 2.5 times faster than rest
of the business intelligence (BI) market. By 2018, investing in this
enabler of end-user self service will become a requirement for all
enterprises.”

Several vendors have already launched big data analytics tools with
“self-service” capabilities, and experts expect that trend to continue
into 2017 and beyond. IT is likely to become less involved in the
process as big data analytics becomes more integrated into the ways that
people in all parts of the business do their jobs.

6. RapidMiner

RapidMiner claims to be the “#1 open source data science platform,” and
Gartner named it a leader in its Magic Quadrant report for advanced
analytics. It enables self-service predictive analytics and promises
lightning-fast performance. Its users include BMW, Lufthansa, Domino’s
Pizza, Sony, Ford, Salesforce, Amnesty International and GE. The
complete RapidMiner Platform includes three separate pieces: RapidMiner
Studio, RapidMiner Server and RapidMiner Radoop. All three are available
under open source or commercial licenses, and commercial prices depend
on the number of users.

7. Storm

Used by companies like Yahoo, Twitter, Spotify, Yelp, Flipboard and
Groupon, Apache Storm is a real-time big data processing engine. Its
website explains, “Storm makes it easy to reliably process unbounded
streams of data, doing for real-time processing what Hadoop did for
batch processing.” Customers can use it with any database and any
programming language. It’s scalable, fault-tolerant and easy to deploy.
Users should note, however, that Storm has not yet reached the 1.0
release level.

1. Hadoop

It would be impossible to talk about open source data analytics without
mentioning Hadoop. This Apache Foundation project has become nearly
synonymous with big data, and it enables large-scale distributed
processing of extremely large data sets. A survey conducted by TDWI and
SAS found that nearly 60 percent of enterprises expected to have Hadoop
clusters in production by the end of 2016.

However, it should be noted that Hadoop on its own doesn’t enable data
analytics. It’s usually part of a larger solution for gathering insights
from big data.

10. Drill

Apache Drill allows users to use SQL
queries for non-relational data storage systems. It supports a range of
NoSQL and cloud-based data storage systems, including HBase, MongoDB,
MapR-DB, HDFS, MapR-FS, Amazon S3, Azure Blob Storage, Google Cloud
Storage and Swift. It also allows users to search through multiple
datasets stored with different technologies using a single query. In
addition, it supports many popular BI tools.

11. MongoDB

One of the best-known NoSQL
databases, MongoDB is an open-source
non-relational data storage solution. Its customers include MetLife, the
city of Chicago, Expedia, Google, The Weather Channel, BuzzFeed and
Facebook. In addition to the free open source version, the company also
offers a paid Enterprise version and MongoDB Atlas, a cloud-hosted
version. Forrester has named MongoDB a “Leader” for big data NoSQL.
