Techniques and Algorithms in Data Science for Big Data
Laila Moretto
Data Scientist / Adjunct Professor
University of Maryland University College
 


 

Tuesday, March 31, 2015
11:15 AM - 12:00 PM

Level:  Intermediate


Respondents in a 2012 study conducted by Infochimps and SSWUG.org listed “finding talent” and “finding the right tools” as the most significant challenges they face when working with Big Data. An article in InformationWeek highlighted that two reasons Big Data projects fail are selecting the wrong uses and asking the wrong questions. Yet many companies buy tools and hire consultants without knowing what data science involves. A big part of managing Big Data is asking the right questions. How would an organization know what to ask, or whether the data scientist it hires is solving its specific problems using data science techniques? This presentation will give Enterprise Data World attendees clear techniques and algorithms that help avoid those failures.

This presentation will inform, educate, and above all provide attendees with the confidence they need to hire the right data scientist, know whether the right problems are being addressed, and most importantly prevent failures that can emerge from asking the “wrong questions.” While many techniques and algorithms can be used, this presentation will focus on the main techniques and algorithms needed to do data science. The presenter will cover the following topics in detail:

  • Cluster analysis – Big Data is complex; cluster analysis makes it easier to work with by grouping data points according to their similarities.
  • Naïve Bayes – A probabilistic classifier that can be used for sentiment analysis and other classification tasks.
  • Regression – Various regression techniques used in predictive analysis.
  • Correlation – The differences between correlation and causation.
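
As an illustration of the correlation item above, here is a minimal Python sketch, not taken from the presentation itself. It assumes made-up numbers in which a hypothetical third variable (temperature) drives two other series. The names `pearson`, `temperature`, `ice_cream_sales`, and `sunburn_cases` are illustrative assumptions, not part of the session material.

```python
from math import sqrt


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical, made-up data: temperature is the common driver of both series.
temperature = [10, 15, 20, 25, 30, 35]
ice_cream_sales = [2 * t + 5 for t in temperature]
sunburn_cases = [3 * t - 10 for t in temperature]

# The two series never influence each other, yet they are strongly correlated
# because both depend on temperature.
r = pearson(ice_cream_sales, sunburn_cases)
print(f"correlation between sales and sunburn cases: {r:.3f}")
```

The coefficient comes out very close to 1 even though neither series causes the other, which is the kind of result that can lead to a "wrong question" if correlation is read as causation.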

The presentation will address the differences between the various algorithms, when each should be used, what kinds of questions each can answer, the most appropriate use cases for each, and which algorithms work best for specific cases and why.


Dr. Moretto brings her professional and academic expertise to EDW. She is an adjunct professor with the University of Maryland University College Graduate School of Management and Technology and works for the MITRE Corporation. Throughout her twenty-year career, Dr. Moretto has directed, managed, consulted on, and assisted projects in a wide variety of roles covering all aspects of enterprise architecture, data, and information engineering, including data modeling, database administration, requirements analysis, and strategic management. Her most recent interests are Data Science, Big Data, and Analytics. Dr. Moretto is an international speaker. She has presented at many conferences, including Latin America Data Management in Brazil and EDW, and for DAMA, the Army, and other organizations. Her presentations have covered topics such as complex systems, complexity leadership, Metadata, Service Oriented Architecture (SOA) and Web Services, Data Science, and Analytics, among others. Dr. Moretto holds a doctorate in Human and Organizational Learning from The George Washington University. Her dissertation focused on leadership and decision-making in complex environments. She holds a Master's degree in Information Technology from Virginia Tech and a Bachelor of Science degree in Information Systems from the University of Texas at Arlington. Additionally, she holds a Federal CIO certification from Carnegie Mellon University.


   
Close Window