Big Data

By fasoulas

Build a Healthcare Data Warehouse Using Amazon EMR, Amazon Redshift, AWS Lambda, and OMOP | AWS Big Data Blog

In the healthcare field, data comes in all shapes and sizes. Despite efforts to standardize terminology, some concepts (e.g., blood glucose) are …

Cloud Computing

A technical overview of Azure Cosmos DB

Microsoft’s globally distributed, multi-model database service – A technical overview. Azure Cosmos DB is Microsoft’s globally distributed, …

Cloud Computing

Optimization tips and tricks on Azure SQL Server for Machine Learning Services

Summary: Since SQL Server 2016, a new function called R Services has been introduced. Microsoft recently announced a preview for the next version of …

Machine Learning

Encrypt and Decrypt Amazon Kinesis Records Using AWS KMS | AWS Big Data Blog

Customers with strict compliance or data security requirements often require data to be encrypted at all times, including at rest or in transit …

Amazon Web Services

Securely Analyze Data from Another AWS Account with EMRFS | AWS Big Data Blog

Sometimes, data to be analyzed is spread across buckets owned by different accounts. In order to ensure data security, appropriate credentials …

Amazon Web Services

Data, data, and more data – How and Where to store it on Azure?

A while back Amy Nicholson and I kicked off TechDays Online with our session “Data, data, data – How and Where to store it on Azure?” The idea was to …

Programming

Azure Stream Analytics Tools for Visual Studio

Have you had a chance to try out the public preview version of the Azure Stream Analytics Tools for Visual Studio yet? If not, read through this blog …

Microsoft Visual Studio

Create an End-to-end IOT Scenario with Azure Services

IoT, or the Internet of Things, is taking over the world, and we are encountering more and more of it in our daily lives. But when we come to think of it …

Cloud Computing

Hazelcast Launches an Open Source In-Memory Stream Processing Engine

Hazelcast, known chiefly for its open source in-memory data grid (IMDG), has launched an open source lightweight, distributed data-processing engine …

Big Data

Serving Real-Time Machine Learning Predictions on Amazon EMR

The typical progression for creating and using a trained model for recommendations falls into two general areas: training the model and hosting the …

Cloud Computing

Amazon Redshift Engineering’s Advanced Table Design Playbook: Preamble, Prerequisites, and Prioritization

Zach Christopherson is a Senior Database Engineer on the Amazon Redshift team. Part 1: Preamble, Prerequisites, and Prioritization. Part 2: …

Databases

Run Jupyter Notebook and JupyterHub on Amazon EMR

Tom Zeng is a Solutions Architect for Amazon EMR. Jupyter Notebook (formerly IPython) is one of the most popular user interfaces for running Python, R, …

Python Programming

Derive Insights from IoT in Minutes using AWS IoT, Amazon Kinesis Firehose, Amazon Athena, and Amazon QuickSight

Ben Snively is a Solutions Architect with AWS. Speed and agility are essential with today’s analytics tools. The quicker you can get from idea to first …

Cloud Computing

R Server 9 Adds Machine Learning to Work with Your Data Where It Lives

Built by data scientists, the R programming language has always been a tool for data scientists. But Microsoft’s R Server 9, the first full new …

Analyzing Data in S3 using Amazon Athena | AWS Big Data Blog

Neil Mukerje is a Solution Architect for Amazon Web Services. Abhishek Sinha is a Senior Product Manager on Amazon Athena. Amazon Athena is an …

Implementing Authorization and Auditing using Apache Ranger on Amazon EMR

Varun Rao is a Big Data Architect for AWS Professional Services. Role-based access control (RBAC) is an important security requirement for multi-tenant …

Cloud Computing

Low-Latency Access on Trillions of Records: FINRA’s Architecture Using Apache HBase on Amazon EMR with Amazon S3

John Hitchingham is Director of Performance Engineering at FINRA. The Financial Industry Regulatory Authority (FINRA) is a private sector regulator …

Cloud Computing

Real-time Clickstream Anomaly Detection with Amazon Kinesis Analytics

Chris Marshall is a Solutions Architect for Amazon Web Services. Analyzing web log traffic to gain insights that drive business decisions has …

Cloud Computing

A Data Sharing Platform Based on AWS Lambda | AWS Compute Blog

Julien Lepine, Solutions Architect. As developers, one of our top priorities is to build reliable systems; this is a core pillar of the AWS Well …

Installing and Running JobServer for Apache Spark on Amazon EMR

Derek Graeber is a senior consultant in big data analytics for AWS Professional Services. Working with customers who are running Apache Spark on Amazon …

Cloud Computing

Serverless Big Data pipeline on AWS

Lambda is a powerful tool when integrating different services on AWS. During the last months, I've successfully used serverless architectures to …

Cloud Computing

How SmartNews Built a Lambda Architecture on AWS to Analyze Customer Behavior and Recommend Content

This is a guest post by Takumi Sakamoto, a software engineer at SmartNews. SmartNews in their own words: "SmartNews is a machine learning-based news …

Cloud Computing

All the Apache Streaming Projects: An Exploratory Guide

The speed at which data is generated, consumed, processed, and analyzed is increasing at an unbelievably rapid pace. Social media, the Internet of …

Big Data

GitHub on BigQuery: Analyze all the open source code

Posted by Felipe Hoffa, Google Developer Advocate. Google, in collaboration with GitHub, is releasing an incredible new open dataset on Google …

Big Data

Scale out your existing MySQL landscape with Scalebase

In a nutshell, Scalebase is to MySQL what Greenplum DB is to PostgreSQL: it makes it possible to create an MPP database based on MySQL. You can use …

MySQL

Real-time in-memory OLTP and Analytics with Apache Ignite on AWS

Babu Elumalai is a Solutions Architect with AWS. Organizations are generating tremendous amounts of data, and they increasingly need tools and systems …

Big Data

Apache Gets Yet Another Stream Processing Engine with Apex

The recent promotion of DataTorrent’s Apex to an Apache Software Foundation top-level project gives the foundation yet another open source engine for …

Big Data

Monitoring Performance Across a Data Pipeline

In my last post, I outlined a framework to monitor the performance of data processing frameworks like Apache Storm, Spark, Kafka, etc. These data …

Big Data

Analyze a Time Series in Real Time with AWS Lambda, Amazon Kinesis and Amazon DynamoDB Streams

This is a guest post by Richard Freeman, Ph.D., a solutions architect and data scientist at JustGiving. JustGiving in their own words: "We are one of …

Cloud Computing