Data Science

Unsupervised Machine Learning for Fun & Profit with Basket Clusters

I finally beat the S&P 500 by 10%. This might not sound like much but when we’re dealing with large amounts of capital and with good liquidity, the …

Machine Learning

Static sensitivity analysis - Statistical Modeling, Causal Inference, and Social Science

After this discussion, I pointed Ryan Giordano, Tamara Broderick, and Michael Jordan to Figure 4 of this paper with Bois and Jiang as an example of a …

Ask 3 Kinds Of Questions About Masters In Data Science Degrees

Email comes in to me nearly every day, full of questions about getting a Masters In Data Science. It’s good that prospective students are asking questions. It’s just that they’re asking the wrong person. Last summer, I wrote an article summarizing the cautions that I share with the many people who …

Big Data

Gold Data Science

The price of gold went up $12 this week, and that of silver $0.50. That’s not bad for gold and silver owners, and not good for the vast majority who …

gold

5 ways to measure running time of R code

(This article was first published on Alexej's blog, and kindly contributed to R-bloggers) A reviewer asked me to report detailed running times for all …

R Language

How to Insert Results of Stored Procedure into a Temporary Table? - Interview Question of the Week #124

Question: How to Insert Results of Stored Procedure into a Temporary Table? Answer: Very simple question, but the answer is sometimes not as easy as …

65 Free Data Science Resources for Beginners

In this guide, we’ll share 65 free data science resources that we’ve hand-picked and annotated for beginners. To become a data scientist, you have a …

Machine Learning

Implementing Decision Trees using Scikit-Learn

What is Scikit-Learn? Scikit-Learn is a popular library for machine learning in the Python programming language. If you want to test your knowledge with …

Machine Learning

Neural Networks are Overrated

Deep Learning

Testing the Hierarchical Risk Parity algorithm

(This article was first published on R – QuantStrat TradeR, and kindly contributed to R-bloggers) This post will be a modified backtest of the Adaptive …

R Language

5 Lessons Marketers Can Learn from Airbnb's Data Science Manager

May 25, 2017. The marketplace for short-term home rentals relies on data and testing and learning to drive ROI …

Marketing

Everything that Works Works Because it's Bayesian: Why Deep Nets Generalize?

HINT: because they are really just an approximation to Bayesian machine learning. One of the hottest topics at ICLR this year was generalisation in …

Deep Learning

Data Science and Machine Learning in Practice

"Data Science and Machine Learning in Practice" - Keynote Speech by Dr David Hardoon, Chief Data Officer, Monetary Authority of Singapore, at the 7th …

Big Data

Convergence of Langevin MCMC in KL-divergence. (arXiv:1705.09048v1 [stat.ML])

Authors: Xiang Cheng, Peter Bartlett. Langevin diffusion is a commonly used tool for sampling from a given distribution. In this work, we establish …

Machine Learning

Visualizing your fitted Stan model using ShinyStan without interfering with your Rstudio session - Statistical Modeling, Causal Inference, and Social Science

ShinyStan is great, but I don’t always use it because when you call it from R, it freezes up your R session until you close the ShinyStan window. But …

Statistical Modeling

RQGIS release 1.0.0

Today we are proud to announce a major release of the RQGIS package providing an interface between R and QGIS. We have completely rewritten RQGIS by …

Python Programming

Test Set for IVP Solvers

Engineers and computational scientists alike will benefit greatly from having a standard test set for Initial Value Problems (IVPs) which …

SAS Grid Manager

Author: Steven O'Donoghue | Category: global te, Global Technology Practice, high-performance analytics, SAS Grid Manager, SAS Visual Analytics, SAS …

Software Development

DataScience.com Releases Python Package for Interpreting the Decision-Making Processes of Predictive Models

DataScience.com has released a beta version of Skater, its new Python library for interpreting predictive models. Skater uses a combination of …

Machine Learning

A practical explanation of a Naive Bayes classifier

The simplest solutions are usually the most powerful ones, and Naive Bayes is a good proof of that. In spite of the great advances of the Machine …

Statistics

Subplots in maps with ggplot2

(This article was first published on Ilya Kashnitsky, and kindly contributed to R-bloggers) Following the surprising success of my latest post, I …

R Language

8 out of 10 cats fear statistics – AI doesn't have this problem

If statistics were a human being, it would have been in deep therapy all of its 350-year life. The sessions might go like this: Statistics: "Everyone …

Getting off the Struggle Bus, Part 2: Cleaning and Plotting Public Transit Data

[This is Part 2 in a 3-part series on my experiences as a mentee in the Chicago Python User Group (ChiPy) mentorship program. Read Part 1 …

Python Programming

The GEOMAGIA database

Funded by: SPP 1488. Available Models: This section gives an overview of the available global geomagnetic field models that can be calculated using the …

Google Earth

Cultivating Data Scientists

Data science is now table stakes. Five years ago, it was identified as the sexiest job of the twenty-first century. Now, nearly all competitive …

DevOps

Data wrangling : I/O (Part-1)

(This article was first published on R-exercises, and kindly contributed to R-bloggers) Data wrangling is a task of great importance in data analysis. …

Python Programming

Genomic epidemiology reveals multiple introductions of Zika virus into the United States

Zika virus (ZIKV) is causing an unprecedented epidemic linked to severe congenital abnormalities. In July 2016, mosquito-borne ZIKV transmission was …

Viruses

The Sierpinski Triangle: Visualising infinity in R

(This article was first published on The Devil is in the Data, and kindly contributed to R-bloggers) Wacław Sierpiński was a mathematical genius who …

Mathematics

CI for difference between independent R square coefficients

(This article was first published on R code – Serious Stats, and kindly contributed to R-bloggers) In an earlier blog post I provided R code for a CI …

R Language

Airbnb is running its own internal university to teach data science

Tech companies, and increasingly even non-tech companies, are struggling with the fact that there are not enough trained data scientists to fill market demand. Every company has their own strategy for hiring and training, but Airbnb has taken things a step further — running its own university-style …

Technology