By Data Science Girl

Starred articles were potential candidates for our picture of the week published in our weekly digest. Enjoy our new selection of articles and resources (R, data science, Python, machine learning, etc.). Comments are from Vincent Granville. For a full list of all resources featured so far, click here.


  1. Code for learning the Structure of Graphical Models **
  2. PokitDok HealthGraph **
  3. Data Wrangling with dplyr and tidyr Cheat Sheet
  4. Deep Learning in a Nutshell
  5. Do we Need Hundreds of Classifiers to Solve Real World Classification Problems? – PDF document
  6. Video: Advanced Machine Learning with scikit-learn
  7. Predictive Modeling with R and the caret Package
  8. Protovis: A Graphical Toolkit for Visualization
  9. R Data: Data Analysis and Visualization Using R
  10. How to Choose Between Learning Python or R First
  11. Top 50 open source web crawlers for data mining
  12. Year 2014 in Review as Seen by an Event Detection System **
  13. Optimization Algorithms in Machine Learning
  14. Machine Learning course Video
  15. Course from Rice University: An Introduction to Interactive Programming in Python (starting soon)
  16. MapReduce: Simplified Data Processing on Large Clusters
  17. MapReduce Online
  18. Distributed Hash Tables, Part I
  19. One Page R: A Survival Guide to Data Science with R
  20. Abridged List of Machine Learning Topics **
  21. Decision Tree Algorithms – Simplified
  22. DataQuest – Browser-based learning for data science

Picture is from resource #2 above


  1. A Startup’s Neural Network Can Understand Video
  2. Sneak peek at the new Pinterest graph-based database
  3. The Top Mistakes Developers Make When Using Python for Big Data Analytics
  4. Microsoft unveils Trill, a .NET library for super-fast streaming analytics
  5. 10 Big Data Experts to Know
  6. How to see data – NPR Video
  7. Microsoft to acquire Revolution Analytics **
  8. Text Analytics Market is Expected to Reach $6.5 Billion by 2020
  9. Avoiding a common mistake with time series
  10. Comparing trends of median income growth since 1984 **
  11. IBM expected to cut 26% of its 430,000 work force
  12. A beautiful story about NYC weather **
  13. Why now is the time to learn R
  14. California’s epidemic of vaccine denial, mapped ** – I am pro-vaxxer but unvaccinated. If there were a preventable disease causing 2,000 deaths per year in the US, I would take the vaccine. So far, I am not aware of any such disease. Flu does not qualify: I am currently able to fight it myself or avoid it (I am not sure when I last got the flu; it must have been more than 15 years ago, or maybe when I was a kid). I got measles when I was a kid, so no need to vaccinate, and anyway, it causes fewer deaths than lightning strikes.
  15. The 3 deadliest drugs in America are legal * – How can you tell that a dual tobacco/alcohol user died from tobacco rather than alcohol, or something else?
  16. Can Microsoft make R easy?
  17. Spark May Be Hotter Than Hadoop, But It Still Has Issues
  18. In-demand big data skills: a mix of old and new
  19. Why now is the time to learn R
  20. Index of well-being, by country *
  21. Was 2014 the hottest year on record? – Interesting discussion about lack of statistical significance.
  22. Where Young College Graduates Are Choosing to Live – The future is people working remotely: on a beach, in the mountains, any place where the Internet signal is strong, even if it means using satellite Internet.
  23. Coaching by numbers: is data analytics the future of management?
  24. A New Mathematical Proof Shows How Some Spaces Can’t Always Be Divided
  25. Visualizing The Cost Of Living Around The World *
  26. Digging Into Data: It’s Not Just for Math Class
  27. What are the steps to Data Driven success? – Infographics
  28. Don’t Worry, Python Isn’t Displacing R
  29. Companies try to automate the data scientist function to deal with skills gap
  30. Why an obscure data-mining company is worth $3 billion
  31. Ray Kurzweil’s Mind-Boggling Predictions for the Next 25 Years
  32. California Mulling More Government Access to Cars’ On-Board Computers – Potentially lucrative job of the future: deactivating data tracking and reporting on all sensors, be it car computers, viruses installed on your computer to track your activity, or home sensors. It can probably be performed remotely.
  33. Swiss Authorities Arrest Bot for Buying Drugs and Fake Passport
  34. How Likely Is It That Birth Control Could Let You Down? *
  35. LinkedIn Revamps Its Search Engine For Speed And Relevance
  36. Study Questions Link Between Asthma and City Living – Illustrates the difference between causation and correlation, and the impact of confounding variables in a model.
  37. Apple buys music analytics startup Semetric to bolster Beats
  38. What Are the Odds That Stats Would Be This Popular? – Most of these high-ROI people don’t call themselves statisticians; their job title is typically data scientist, data engineer, analyst, or data miner. And they don’t work on theoretical problems. Nearest neighbor classifiers have several drawbacks, and optimizing K, even locally, is not the solution. Yet another NYT article full of garbage and disguised advertising.
  39. NYC data science on Twitter ** – Infographics
  40. Facebook open sources its cutting-edge deep learning tools
  41. How politicians are unlike America **
  42. How Nonemployed Americans Spend Their Weekdays: Men vs. Women **

Previous selection of external articles available here.

Read more here: Data Science Central Featured Blog Posts