Machine Learning Techniques for Analyzing Unstructured Business Data
Nick Pendar
NLP Data Scientist
Skytree
 


 

Thursday, August 21, 2014
09:30 AM - 10:00 AM

Level:  Technical - Introductory


Many business applications, such as fraud analysis, hardware maintenance, healthcare, and insurance, deal with extremely large, unstructured, human-generated text datasets. Tapping into this wealth of information remains very challenging because of the inherent creativity and ambiguity of human language. The relevant information often emerges from the relationships among a subset of documents and is not visible in any individual record. This presentation provides an overview of the most interesting analytic problems we are trying to solve with machine learning, most of which have a heavy textual component alongside other structured information. Nick Pendar, NLP Data Scientist at Skytree, will also demonstrate some of the capabilities we are building to apply Skytree’s core machine learning and analytic algorithms at large scale.


As a natural language processing (NLP) expert, Nick Pendar applies machine learning and data mining techniques to textual data in order to classify, extract, and organize information from a variety of sources. Nick received his Ph.D. from the University of Toronto in 2005 and, in the same year, started an academic position at Iowa State University, where he conducted and directed research on NLP and text categorization for various educational and legal purposes. Prior to joining Skytree as an NLP Data Scientist, Nick held engineering and R&D positions at Groupon, Uptake, and H5. He has published papers and given numerous talks on NLP to a variety of audiences for over 14 years; he has also filed multiple patents and is an active member of several related professional organizations and conferences.


   