Parviz Deyhim

@pdeyhim on Flipboard

Understand the GDPR in 10 minutes

or, “Just tell me what I have to do, man.” Like a school kid who hasn’t started their year-end project by May, are you kept awake at night with a …

Simon Wardley on Twitter: "X : Any strategy tips? Me : Yes ... Don't. a) Get 10-15 ppl in your org to colour the chart (attached) which will probably look "all red" (see banking). b) Take action to make it "more green" c) When it looks more like the "e-commerce" giant, you'll be ready to discuss strategy.… https://t.co/GiE3zeun93"

It's always startling to ask people to draw out their doctrine to begin with, teach them some basics of mapping and ask them that …

Data Encodings and Layout

Like many other messaging products and services, the services I build with my team at Microsoft mostly take a neutral stance towards payload data. We …

real-time, infinite data storage | It’s time to realize Apache Kafka’s full potential, spanning past and present

Kafka users enjoy a broad sweet spot, one that can naturally grow in the context of use cases and in concert with the organization that runs it. You …

Quick 'n Easy Population of Realistic Test Data into Kafka

tl;dr: Use curl to pull data from the Mockaroo REST endpoint, and pipe it into kafkacat. Three things I love…Kafka, kafkacat, and Mockaroo. And …

Yes We Can! Distributed ACID Transactions with High Performance

Atomicity refers to all the work in a transaction being treated as one atomic unit: either all of it is performed or none of it is. Consistency …
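
The all-or-nothing property described in this teaser can be illustrated with a minimal sketch. It uses a single-node, in-memory SQLite database from Python's standard library, not the distributed system the article discusses. The `accounts` table, the `transfer` helper, and the no-negative-balance rule are hypothetical names and rules chosen only for illustration.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst; both updates commit or neither does."""
    try:
        # Using the connection as a context manager commits on success
        # and rolls back if an exception escapes the block.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            # Hypothetical invariant for this sketch: no negative balances.
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE name = ?", (src,)
            ).fetchone()
            if balance < 0:
                raise ValueError("insufficient funds")
        return True
    except ValueError:
        return False


def balances(conn):
    """Return the current balances as a {name: balance} dict."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)]
)
conn.commit()
```

A transfer that violates the invariant leaves both rows exactly as they were, because the partial updates are rolled back together.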

A Look at Ten New Database Systems Released in 2017

As editor of Database Weekly, a weekly newsletter on what’s new in the world of databases and data storage generally, I enjoy poking around new …

Databases

Data Infrastructure at In Loco

Every single tech company that operates at a very large scale will tell you about the importance of knowing how to properly handle data transport and …

Crowdsourcing big-data analysis

In the analysis of big data sets, the first step is usually the identification of “features” — data points with particular predictive power or …

Ranking Websites in Real-time with Apache Kafka’s Streams API

This article is by Hunter Kelly, Technical Architect at Zalando. Hunter enjoys using technology, and in particular machine learning, to solve …

Best-Ever Algorithm Found for Huge Streams of Data

To efficiently analyze a firehose of data, scientists first have to break big numbers into bits. It’s hard to measure water from a fire hose while …

Apache Kafka and the four challenges of production machine learning systems

Untangling data pipelines with a streaming platform. Machine learning has become mainstream, and suddenly businesses everywhere are looking to build …

Analyzing Twitter Hashtag Impact using Neo4j, Python & JavaScript

This is the first demo I developed with Neo4j. The objective of the demo is to open the discussion about graph databases, Neo4j, big data, analytics …

Using Kafka Streams API for predictive budgeting

Boyang Chen | Pinterest engineer, Ads infrastructure. At Pinterest, we use Kafka Streams API to provide inflight spend data to thousands of ads servers …

Arbitrary Stateful Processing in Apache Spark’s Structured Streaming

This is the seventh post in a multi-part series about how you can perform complex streaming analytics using Apache Spark and Structured …

Cost Based Optimizer in Apache Spark 2.2

This is a joint engineering effort between Databricks’ Apache Spark engineering team (Sameer Agarwal and Wenchen Fan) and Huawei’s engineering team …

Separation of compute and state in Google BigQuery and Cloud Dataflow (and why it matters) | Google Cloud Big Data and Machine Learning Blog | Google Cloud

Tuesday, October 10, 2017. Posted by Tino Tereshko, Big Data Lead, Google Cloud Platform Office of CTO. (Thanks to Rodd Zurcher, Engineering Director, …

Using Machine Learning to predict parking difficulty

Posted by James Cook, Yechen Li, Software Engineers and Ravi Kumar, Research Scientist. "When Solomon said there was a time and a place for everything …

StreamAlert: Real-time Data Analysis and Alerting

Today we are incredibly excited to announce the open source release of StreamAlert, a real-time data analysis framework with point-in-time alerting. …

Actors

Summary: The transition from sequential to parallel computation is an area of critical concern in today's computer technology, particularly in …

CAP Twelve Years Later: How the "Rules" Have Changed

This article first appeared in Computer magazine and is brought to you by InfoQ & IEEE Computer Society. The CAP theorem asserts that any networked …

The Google File System – Google AI

We have designed and implemented the Google File System, a scalable distributed file system for large distributed data-intensive applications. It …

Availability and Partition Tolerance

When I’m talking to people about Eric Brewer’s CAP Theorem, one of the things that’s hardest to explain is the operative definitions of availability …

Burn the Library

Write contention occurs when two people try to update the same piece of data at the same time. We know several ways to handle write contention, and …
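
To make the write-contention scenario concrete, here is a minimal sketch of one widely used technique, optimistic concurrency with version checks. The teaser is cut off before it names any approach, so this is not necessarily one the article covers. `VersionedStore` and its methods are hypothetical names invented for this example.

```python
class VersionedStore:
    """In-memory key-value store where every write must name the version it read."""

    def __init__(self):
        self._data = {}  # key -> (version, value)

    def read(self, key):
        """Return (version, value); a missing key reads as version 0, value None."""
        return self._data.get(key, (0, None))

    def write(self, key, expected_version, value):
        """Apply the write only if nobody else has written since `expected_version`."""
        current_version, _ = self._data.get(key, (0, None))
        if current_version != expected_version:
            return False  # lost the race: caller should re-read and retry
        self._data[key] = (current_version + 1, value)
        return True


store = VersionedStore()

# Two writers read the same version, then both try to update.
alice_version, _ = store.read("doc")
bob_version, _ = store.read("doc")

alice_ok = store.write("doc", alice_version, "alice's edit")
bob_ok = store.write("doc", bob_version, "bob's edit")
```

The second writer's update is rejected rather than silently overwriting the first, which leaves it to the caller to re-read and decide how to merge or retry.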

Handling Disk Failures In Cassandra 1.2

Cassandra is great at handling entire node failures. It's not just robust, it's almost indestructible. But until Cassandra 1.2, a single unavailable …