At Strata + Hadoop World, which is happening in San Jose today, Microsoft is making several announcements to continue delivering on its commitment to make big data processing and analytics simpler and more accessible. The company announced R Server for Azure HDInsight, an R implementation that Microsoft describes as 100% compatible with open source R. Microsoft says it runs the most comprehensive set of ML algorithms and statistical functions in the cloud, leveraging Hadoop and Spark. Also, Spark for Azure HDInsight has been updated to the latest Apache Spark 1.6 release, gaining critical performance improvements, including a 10x speedup for streaming state management, automatic memory management, and new machine learning algorithms and capabilities.
- Advanced analytics at scale, with R Server for HDInsight and the latest version of Spark for HDInsight now available in preview: Customers can leverage their existing R skills and reuse current code to run at scale. R Server for HDInsight offers popular scalable R algorithms and the ability to parallelize any existing R function. Microsoft is also releasing the latest version of Spark for HDInsight, which can deliver 7x the performance of MapReduce for most analytics. These capabilities let customers train and run advanced analytics and ML models on larger datasets, and much faster than was previously possible in the cloud.
- Out-of-the-box application integration, providing easier access to popular big data apps: Customers can now discover and deploy popular big data applications with HDInsight without any code or scripting required. Leading solutions, such as Datameer Cloud for code-free data preparation and AtScale for cloud-based OLAP BI on Hadoop, can now be deployed alongside HDInsight, along with an ecosystem of other big data applications.
- Azure Data Catalog, previously announced as a public preview, will be generally available tomorrow: Data Catalog is an enterprise metadata catalog and portal for the self-service discovery of data sources. Users can now spend less time trying to find, understand, and access the data they need, and more time analyzing it for value.