Contents
- 2. About myself. Who: Faisal Ahmed, TalTech. Where: Narva College. Research: Communication and Software Engineering. Contact: Faisal.Ahmed@ut.ee
- 3. Data?! Neighbor's name A place they consider home Are they working at a company now? How
- 4. Data! Neighbor's name A place they consider home Are they working at a company now? How
- 6. Data Science concerns Is "Data Science" important or just trendy?
- 7. Hmmm… Data Science concerns
- 8. The companies are expanding as fast as the data!
- 9. There's certainly a lot of it! 2015 1 Zettabyte 1 Exabyte 1 Petabyte (brain) 14 PB:
- 10. Data, information, knowledge, wisdom: I'd call it data, not information
- 11. Big Data? I agree with this…
- 12. Make data easier to use ~ by using it! It may be true that Data Science
- 13. IST 380 ~ the big picture What? Why? Data Science Programming Data Rules All of our
- 14. A few examples… Make3d How is this being done? Andrew Ng ~ Computers and Thought award,
- 15. A few examples… … Data Science is at the heart of computer science Stanford's Autonomous Vehicles
- 16. A few examples… … Data Science is at the heart of computer science "my summer was
- 17. A few examples… Learning ground from obstacles classification segmentation
- 18. Insights beyond science
- 19. Marketing
- 20. Visualization Motivation
- 22. Recommender Systems predicting movie ratings
- 23. Bob Bell, winner of the "Netflix prize" Napoleon Dynamite = Batman Begins = Netflix Prize Finding
- 24. Bob Bell, winner of the "Netflix prize" (I don't know this guy) Napoleon Dynamite = Batman
- 25. Why IST 380? Specific skills: R statistical environment (and the S programming language) Experience with
- 26. Why IST 380? Specific skills: Broad background: You'll be confident and capable with whatever datasets
- 27. About IST 380 …
- 28. Details Web Page: http://www.cs.hmc.edu/~dodds/IST380 Assignments, online text, necessary files, lecture slides are linked First week's assignment:
- 29. Homepage http://www.cs.hmc.edu/~dodds/IST380/ Go to the course page Grab R and the text from these two links…
- 30. Homework Assignments ~ 2-5 problems/week ~ 100 points extra credit, often Due Tuesday of the following
- 31. Homework Working on programs: On your own or in groups of 2. Divide the work at
- 32. Outline Weeks 1-5 using R descriptive statistics predictive statistics probability distributions Weeks 6-10 "Data Science" "Machine
- 33. Grading Grades Final project if score >= 0.95: grade = "A" if score >= 0.90: grade
- 34. Academic Honesty This course operates under CGU's (and all of Claremont Schools') Academic Honesty policies… Your
- 35. Thoughts?
- 36. Getting to know… R
- 37. Getting to know… R http://lang-index.sourceforge.net/#categ R is the programmer's toolkit for statistics; SAS, Stata, SPSS are
- 38. Getting to know… R Free… and very well supported online…
- 39. Getting to know… R R is responsive, up-to-date, and flexible: Data Science vs. Statistics
- 40. Getting to know… R 1) Find the IST 380 course webpage Try it! www.cs.hmc.edu/~dodds/IST380/ 2) Download
- 41. Getting started! 1) Open Matloff's Why R? notes 2) Skip ahead to page 7, the "5
- 42. Saving your session 2) Use the Save to file… (Windows) or Save as… (Mac) in order
- 43. Submitting your work 2) From the course webpage, click on the submission site link. You've completed
- 44. Reflection Average and standard deviation? Assignment? Comments? Printing? Comments? Creating a vector?
- 45. R types You can use mode() to view the type of a variable.
- 46. Where's the big data? Vectors are R lists of a single type of element c ~
- 48. Analyzing vectors – try these… Square brackets [] can "subset" (or "slice") vectors
- 49. Analyzing vectors Square brackets [] can "subset" (or "slice") vectors you can use a boolean vector
- 50. NA R uses NA to represent data that is "not available" What is going on here?
- 52. Data frames R's fundamental data structures are data frames The next tutorial will introduce them…
- 53. Irises… setosa virginica data() yields many built-in data files. This is iris
- 54. Subsetting iris data As with vectors, you can "subset" data frames. df[rows,cols]
- 55. Lab… The 2nd part of each class meeting is dedicated to lab work. I welcome you to
- 56. Homework Problem 3: Challenge exercises in R These will reinforce the "subsetting" and data-analysis introduction from
- 57. Lab!
- 58. CS vs. IS and IT? www.acm.org/education/curric_vols/CC2005_Final_Report2.pdf greater integration system-wide issues smaller details machine specifics
- 59. CS vs. IS and IT? Where will IS go?
- 60. CS vs. IS and IT?
- 61. IT? Where will IT go?
- 62. IT?
- 64. The bigger picture Weeks 10-12 Objects Week 10 Week 11 Week 12 Weeks 13-15 Final Projects
- 65. Data?! Neighbor's name A place they consider home Are they working at a company now? How
- 66. state reminders…
- 67. Data! Neighbor's name A place they consider home Are they working at a company now? How