Diving into Data Science

By thiakx | The latest happenings, startups, tech in the world of data science, analytics and visualization.

Deep Residual Learning for Image Recognition

Today’s paper offers a new architecture for convolutional networks. It was written by He, Zhang, Ren, and Sun from Microsoft Research. I’ll warn you …

Apache Spark @Scale: A 60 TB+ production use case

Facebook often uses analytics for data-driven decision making. Over the past few years, user and product growth has pushed our analytics engines to …

Big Data

Reporting on Indonesian peat fires: Which data can you trust?

Joint research by Indonesian and US scientists shows that peat fires in Central Kalimantan in Indonesia released less carbon dioxide than projected …

Climate Change

Machine Learning in a Year

From being a total ML noob to using it at work. This is a follow-up to an article I wrote last year, Machine Learning in a Week, on how I …

Explore the stars with this interactive Star Mapper

Jan Willem Tulp, in collaboration with the European Space Agency, produced the ESA Star Mapper. It shows nearly 60,000 stars in a combination of …

models/lm_1b at master · tensorflow/models

Language Model on One Billion Word Benchmark. Authors: Oriol Vinyals (vinyals@google.com, github: OriolVinyals), Xin Pan (xpan@google.com, github: …

How to teach logic to your #NeuralNetworks?

Logic gates are the fundamental building blocks of electronics. They pretty much make up the complex architecture of all computing systems today. …

Deep Learning

21 Must-Know Data Science Interview Questions and Answers

KDnuggets Editors bring you the answers to 20 Questions to Detect Fake Data Scientists, including what is regularization, Data Scientists we admire, …

Data Science

simulacrum

Simulacrum is a simple way to pass in a dictionary object, with column names and corresponding data types, and output a pandas DataFrame …

Ask Why! Finding motives, causes, and purpose in data science

Some people equate predictive modelling with data science, thinking that mastering various machine learning techniques is the key that unlocks the …

Data Science

A Practical Introduction to Deep Learning with Caffe and Python // Adil Moujahid // Data Analytics and more

Deep learning is the new big trend in machine learning. It has had many recent successes in computer vision, automatic speech recognition and natural …

10 Machine Learning Online Courses For Beginners

The following is a list of mostly free machine learning online courses for beginners. If video lectures aren’t your thing, and books better suit …

Announcing the R Shapefile Contest

Today I am happy to announce the R Shapefile Contest. The goal of the contest is to encourage and promote high-quality work at the intersection of R …

The impact on jobs: Automation and anxiety

Special report. Artificial intelligence: The impact on jobs. Automation and anxiety. Will smarter machines cause mass unemployment? …

Deep Learning Demystified

Square pie chart beats out the rest in perception study

Many hate pie charts. Others love them. I think they’re useful but have limitations. Most of these are just feelings though, maybe accompanied by an …

Data Visualization

JupyterLab: the next generation of the Jupyter Notebook

Learning the lessons of the Jupyter Notebook. It's been a long time in the making, but today we want to start engaging our community with an early …

Introduction to Deep Learning for Image Recognition

This notebook accompanies the Introduction to Deep Learning for Image Recognition workshop to explain the core concepts of deep learning with emphasis …

Selecting Columns Programmatically Using Column Expressions Tutorial

This page is part of the documentation for the Machine Learning Database. It is a static snapshot of a Notebook which you can play with interactively …

The Theorem Every Data Scientist Should Know (Part 2)

13 Jul 2016. Last week, I wrote a post about the Central Limit Theorem. In that post, I explained through examples what the theorem is and why it’s so …

Altair

Altair is a declarative statistical visualization library for Python. Altair is developed by Brian Granger and Jake Vanderplas in close collaboration …

Python Programming

Applications

When we are first introduced to deep learning, we see it as a better machine learning classifier. …

Building a data science portfolio: Machine learning project

This is the third in a series of posts on how to build a Data Science Portfolio. If you like this and want to know when the next post in the series …

Continuously Collecting API Data with Python and Digital Ocean

A few months ago, I wanted to collect minute-by-minute Uber surge price data over the course of a week between Penn Station, NY, and Union Square, NY. I …

Python Pandas Functions in Parallel - Jay's Website

I’m always on the lookout for quick hacks and code snippets that might help improve efficiency. Most of the time that’s through Stack Overflow but …

Machine Learning is Fun!

The world’s easiest introduction to Machine Learning. Update: This article is part of a series. Check out the full series: Part 1, Part 2, Part 3, Part 4, …

Effective Pandas

A collection of notebooks behind my series on writing idiomatic pandas. Contents: Modern Pandas, Method Chaining, Indexes, Fast Pandas, Tidy …

The Theorem Every Data Scientist Should Know

04 Jul 2016. Yesterday, I was reading a thread on Quora. The people in this thread were answering the following question: What are 20 questions to …