Thursday, October 29, 2015

Lukas Biewald, CrowdFlower // Enriching Your Data

Lukas Biewald, CrowdFlower // Enriching Your Data from FirstMark Capital

  1. Lukas Biewald
  2. (no text)
  3. The Effect of Better Algorithms [Bar chart: classifier error rate for Naïve Bayes, Maximum Entropy, and SVM on real-world data. Source: Active Semi-Supervised Learning for Improving Word Alignment (Vamshi ACL ’10)]
  4. The Effect of Better Features [Bar chart: classifier error rate for unigrams, bigrams, and unigrams+bigrams]
  5. The Effect of More Data [Bar chart: classifier error rate for data set sizes N, 2N, and 4N on real-world data. Source: Active Semi-Supervised Learning for Improving Word Alignment (Vamshi ACL ’10)]
  6. The Effect of Cleaner Data [Bar chart: classifier error rate for 90%, 95%, and 100% accurate data]
  7. Where Do Data Scientists Spend Their Time? (Source: CrowdFlower Data Science Report 2015)
  8. CrowdFlower Data Enrichment Platform
  9. Color Data
  10. (no text)
  11. (no text)
  12. (no text)
  13. (no text)
  14. (no text)
  15. (no text)
  16. Apple Watch
  17. Apple Watch
  18. Apple Watch
  19. Apple Watch
  20. Collecting the Same Data Over and Over
  21. Open Data
  22. Make Your Data Public Setting
  23. Data for Everyone
  24. Data for Everyone Library
  25. Data for Everyone
  26. Data for Everyone
  27. Open Data API
  28. URL Categorization
  29. Categorize URLs
  30. Record Data
  31. Extracting Names and Titles
  32. Summarization
  33. Is an Image Funny?
  34. Classifying Medical Images
  35. Attributes of People
  36. (no text)
  37. 396 Scripts
  38. Thank You. Lukas Biewald, lukas@crowdflower.com, @L2K
