Jowanza Raheem Joseph


My Productivity Routines


Thoughts on Dremio

Thoughts on IFTTT

Over-engineering My Smart Home

Bose QuietComfort 35 vs Sony 1000X

2 Months With The Bose SoundSport Pulse

The Spanner Paper

3 Months With The Amazon Echo Show

Jathena: An Open Source Amazon Athena

Software Development

Partitions in Apache Spark

My Experience Submitting Proposals to Big Data Conferences

15 Minutes a Day

The Pleasure and Boredom of Doing Too Much

The Promise and Expense of Distributed In-Memory Data Processing

Big Data

Ergonomics in Data Engineering


Beginning Apache Flink

I’ve been committed to the Apache Spark ecosystem for as long as I’ve been doing data engineering. I’ve seen its adoption, been a fan of using the …

Big Data

Creating A Spark Server For Every Job With Livy

One of the frustrations that most people who are new to Spark have is how exactly to run Spark. Before running your first Spark job you’re likely to …


How Alluxio is Accelerating Apache Spark Workloads

Alluxio is fast virtual storage for Big Data. Formerly known as Tachyon, it’s an …

A Gentle Intro To Graph Analytics With GraphFrames

Anyone steeped in the doctrine of relational databases will find that trying to use a graph database like Neo4j is painful and not at all intuitive. …


Window Functions In SparkSQL

SparkSQL provides an easy-to-use API for distributed datasets on the Spark Platform. It’s trivial to do sums, group-bys, pivots and other …

Data Science
Home Automation

Time-Series Missing Data Imputation In Apache Spark

In a recent project, I needed to do some time-based imputation across a large set of data. I tried to implement my own solution with moderate success …

Data Science