# Data Science

### How to Build a Data Science Portfolio

*A portfolio is one way to show people you are that data science unicorn.* **How do you get a job in data science?** Knowing enough statistics, machine …

Big Data

### Building a Linear Regression Model for Real World Problems, in R

In this blog, I will try to make the concept of regression simple and intuitive for everyone. Understanding the maths behind a concept is always a must …

Artificial Intelligence

### Zillow Data Science Interview Questions — Acing the AI Interview

Zillow lists data on 110 million homes in its living database and reports 6.3 billion visits to its site in 2018, per its investor relations. According …

Real Estate

### Probabilistic Programming Primer

**Data Science and interpretability**

I’ve been involved in industrial applications of machine learning, analytics and what is generally referred to as …

Machine Learning

### Microsoft R Open 3.5.1 now available

Microsoft R Open 3.5.1 has been released, combining the latest R language engine with multi-processor performance and tools for managing R packages …

Linux

### Probability and Tennis

What is the probability of winning a tennis match? Your chance, obviously, depends on the ratio of your skill at the sport to that of …

Board Games

### R for Data Science Solutions

Welcome. This contains solutions to the exercises in *R for Data Science*, by Hadley Wickham and Garrett Grolemund (Wickham and Grolemund 2017). The …

Creative Commons Attribution

### data.table is Really Good at Sorting

The data.table R package is *really* good at sorting. Below is a comparison of it versus dplyr for a range of problem sizes. The graph is using a log-log …

Data Science

### Extracted variable name dplyr::mutate

I want to extract a variable name from a data frame and create a new variable with dplyr::mutate. What do I have to write so that the variable name …

Data Science

### A novel probabilistic forecast system predicting anomalously warm 2018-2022 reinforcing the long-term global warming trend

Published: 14 August 2018. *Nature Communications* volume 9, Article number: 3024 (2018). Abstract: In a changing climate, …

Climate Change

### It was the weeds that bothered him. - Statistical Modeling, Causal Inference, and Social Science

Bill Jefferys points to this news article by Denise Grady. Bill noticed the following bit, “In male rats, the studies linked tumors in the heart to …

Statistical Modeling

### DataHack Radio Podcast - Data Science in India with Dr. Avik Sarkar


India

### Amazon Translate now available in the Memsource translation management system | Amazon Web Services

*This is a guest blog post by Andrea Tabacchi, the Solution Architects team lead at Memsource.* Memsource is always looking out for exciting new …

Amazon Web Services

### Hierarchical Bayesian Neural Networks with Informative Priors

Next, we loop over each category and fit a different BNN to each one. Each BNN has its own weights and there is no connection between …

Cognitive Computing

### OpenCV People Counter

Published on August 13, 2018 in Object Tracking, Tutorials. In this tutorial you will learn how to build a …

Python Programming

### A NLP Guide to Text Classification using Conditional Random Fields


Taxonomy

### PCA revisited

(This article was first published on The Beginner Programmer, and kindly contributed to R-bloggers.) Principal component analysis (PCA) is a …

Mobile Tagging

### Training ImageNet in 18 minutes for $40; courteous self-driving cars; and Google evolves alternatives to backprop

**Better robot cars through programmed courteousness:** *…Defining polite behaviors leads to better driving for everyone…* How will self-driving cars and …

Machine Learning

### 15 Statistical Hypothesis Tests in Python (Cheat Sheet)

Published on August 15, 2018 in Statistics. Quick-reference guide to the 15 statistical hypothesis tests that you need in applied machine learning, with sample …

Machine Learning

### Analytical approach to network inference: Investigating degree distribution

Author(s): Gloria Cecchini and Björn Schelter. When the network is reconstructed, two types of errors can occur: false positive and false negative …

Data Science

### A New Type of Leaderboard: Season Stat Grid!

We are debuting a new leaderboard today, the Season Stat Grid. It’s a little different than most of our leaderboards. Instead of having multiple …

Sports Analytics

### How to remove columns and rows that sum to 0 while preserving non-numeric columns

Below is a subset of my data. I am trying to remove columns AND rows that sum to 0 ... the catch is that I want to preserve columns 1 to 8 in the …

Data Science

### Legal Tech: How Can Lawyers Benefit?

The law has always been part and parcel of the human experience. Ever since ancient times, laws have been established to help guide and regulate …

Email Marketing

### Curalate makes social sell with AI using Apache MXNet on AWS | Amazon Web Services

Curalate helps brands convert social influence into sales. The Philadelphia-based startup makes it easy for digital-savvy consumers to make …

Machine Learning

### How feminism has made me a better scientist

Feminism is not a branch of science. It is not a set of testable propositions about the observable world, nor is it any single research method. From …

Feminism

### Bar plot ggplot2 - Error: Aesthetics must be either length 1 or the same as the data (150): fill, x, y

Hey, I know there are lots of questions about this particular error but I still can't find what is wrong. I'm pretty new to R and coding in general. Here …

Data Science

### Curtin University establishes Data Science Innovation Hub

Businesses and startups welcomed aboard.

Technology (Australia)

### Achieving a smooth color ramp

I have a heat map in Excel that I'm trying to recreate in R. It's basically data for an RFM segmentation, and in Excel the color range is great but I'm …

Data Science

### geom_text not properly positioned when using position_dodge(preserve="single") in bar plot

With the code below the labelling in facet bB is not correctly positioned. The problem seems to originate from the fact that there is no …

Faceted Classification

### How to use select() only on columns of a certain type without losing columns of other types?

There are some similar questions (like here, or here), but none with quite the answer I am looking for. The question: How to use select() only on …

Databases