Recent Ingenuity

Python Pandas

Power of Python Pandas

Python Pandas makes it easy to extract and summarize large amounts of data. Below is an example of using airline data to find out how many passengers went to an airport, the accident rate based on reference codes, and the deaths and causes of accidents. With a few lines of code,

Extract User Reviews using Python Pandas

TripAdvisor user reviews data about a particular hotel are used to help the hotel understand the feedback the reviews provide and what it suggests the hotel should focus on to improve customer experience. In Part I, data will be extracted for each reviewer’s ratings of a hotel along with a summary. The

Python Relational Database

In this example, two data “assets” will be created for a company to use in a direct marketing campaign. The raw data will be used to build a Python relational database, from which a “flat” file with selected customers and variables is created. Examples of the summary output will be provided along with

Python Data Types

Python Data Types: Putting Airline Data in Order

The data used in the Python data types example is from OpenFlights.org and contains three data files: one for airports, one for routes, and one for airlines. The data cover the period up to January 2012. The data in the file airports.dat looks like this:

1,"Goroka","Goroka","Papua New Guinea","GKA","AYGA",-6.081689,145.391881,5282,10,"U","Pacific/Port_Moresby"
2,"Madang","Madang","Papua New Guinea","MAG","AYMD",-5.207083,145.7887,20,10,"U","Pacific/Port_Moresby"
3,"Mount
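As a rough illustration of how the sample rows map onto Python data types, the sketch below loads the two complete rows quoted in the airports.dat excerpt with pandas. The column names are assumptions added for readability (the excerpt shows no header row), so treat them as illustrative labels rather than the file's official schema.

```python
import io

import pandas as pd

# The two complete sample rows quoted in the airports.dat excerpt.
SAMPLE = (
    '1,"Goroka","Goroka","Papua New Guinea","GKA","AYGA",'
    '-6.081689,145.391881,5282,10,"U","Pacific/Port_Moresby"\n'
    '2,"Madang","Madang","Papua New Guinea","MAG","AYMD",'
    '-5.207083,145.7887,20,10,"U","Pacific/Port_Moresby"\n'
)

# Assumed column labels: the excerpt has no header row, so these names
# are hypothetical and chosen only to make the example readable.
COLUMNS = [
    "airport_id", "name", "city", "country", "iata", "icao",
    "latitude", "longitude", "altitude", "utc_offset", "dst", "tz",
]

# header=None tells pandas the first row is data, not column names.
airports = pd.read_csv(io.StringIO(SAMPLE), header=None, names=COLUMNS)

# pandas infers a type per column: integers, floats, and strings.
print(airports.dtypes)
print(airports[["name", "iata", "latitude", "longitude"]])
```

Reading the full file would presumably work the same way by passing its path in place of the in-memory sample, assuming every row follows the layout shown in the excerpt.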

Survey Design & Implementation of Top Level Domains

Survey Design for Top Level Domains

Survey Design and Implementation for New Top Level Domains (TLDs)

Abstract: Is the new Top Level Domain (TLD) program a success, or are new top level domain names being purchased by industry insiders? Are industry outsiders aware of the new TLDs and, if so, would they consider registering a new top level domain name? Nobody

Predictive Analytics

The Power to Predict Who Will Click, Buy, Lie, or Die

Cluster Analysis: Average Distance between Clusters

Cluster Analysis on Transformed Variables

Cluster Analysis on Transformed Predictor Variables

Cluster analysis groups a set of objects so that objects in the same group are more similar, in some sense, to each other than to those in other groups. Clusters are identified by assessing the relative distances between points, the relative homogeneity of each cluster and the degree
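To make the idea of relative distances concrete, here is a minimal sketch that compares the average distance within a cluster to the average distance between two clusters. The points and labels are made up for illustration, and the helper names are hypothetical; this is not the method used in the full post, only one simple way to express the comparison with NumPy.

```python
import numpy as np

# Hypothetical 2-D points with made-up cluster labels (illustration only).
points = np.array([
    [0.0, 0.0], [0.0, 2.0], [2.0, 0.0],        # cluster 0
    [10.0, 10.0], [10.0, 12.0], [12.0, 10.0],  # cluster 1
])
labels = np.array([0, 0, 0, 1, 1, 1])


def pairwise_distances(a, b):
    """Euclidean distance between every point in a and every point in b."""
    diffs = a[:, None, :] - b[None, :, :]
    return np.sqrt((diffs ** 2).sum(axis=-1))


def mean_within_distance(pts):
    """Average distance over all distinct pairs inside one cluster."""
    dists = pairwise_distances(pts, pts)
    upper = np.triu_indices(len(pts), k=1)  # each pair once, no self-pairs
    return dists[upper].mean()


def mean_between_distance(a, b):
    """Average distance over all pairs drawn from two different clusters."""
    return pairwise_distances(a, b).mean()


cluster0 = points[labels == 0]
cluster1 = points[labels == 1]

within0 = mean_within_distance(cluster0)
within1 = mean_within_distance(cluster1)
between = mean_between_distance(cluster0, cluster1)

print("within cluster 0:", within0)
print("within cluster 1:", within1)
print("between clusters:", between)
```

A small within-cluster average relative to the between-cluster average suggests the groups are compact and well separated, which is the kind of homogeneity the description refers to.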
