Coding Musings

Trying to share some code ideas

Address Verification API

“Where do you live?” may be a simple question, but when a company asks it, the answer isn’t always so easy. A person’s physical address and mailing address can be two completely separate things. And when companies want to send direct mail, they need to ensure the message is appropriate.

Today I am going to show a Python program I wrote that can process millions of addresses and verify whether each one is a valid mailing address that matches the physical address. I will also show how the majority of online mapping services get it wrong.

Column Names by Table with Row Counts

How big is it? Big? Really Big? Ginormous???

Yesterday, I posted a quick query, Column Names by Table, that searches for a column in any table in the database. Today’s query extends that functionality to tell you how many rows are in each table while avoiding the dreaded

SELECT   COUNT(*)   FROM dbo.SuperHugeTable;

Column Names by Table

Where is it? Seriously, where is it?

We’ve all had the experience of trying to locate a specific column name or wanting to see the metadata for the columns in one or more tables. By reading the information schema tables we can see the metadata and find all occurrences of the column in the database. This information can be quite useful when you need to join columns or want to find hidden relationships.

Add a Space Before Upper Case Letter

“Why doesn’t this function exist?”

We have all been in the situation where we need a little custom function to modify strings or do simple math on a column. Today I am going to show how to create a User Defined Function (UDF) in Microsoft SQL Server.
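
To make the target behavior concrete, here is a minimal Python sketch of the string transformation named in the title. It is only an illustration of the idea, not the SQL UDF from the post; the function name `add_space_before_upper` is hypothetical, and it assumes that only the ASCII letters A to Z count as upper case.

```python
import re


def add_space_before_upper(text):
    """Insert a space before each upper-case letter that follows a non-space character."""
    # (?<=\S) requires a non-whitespace character just before the position,
    # so nothing is added at the start of the string or after an existing space.
    # (?=[A-Z]) requires an ASCII upper-case letter right after the position.
    return re.sub(r"(?<=\S)(?=[A-Z])", " ", text)


print(add_space_before_upper("SuperHugeTable"))
```

Note that with this rule a run of capitals is split letter by letter, so an input such as "ABC" becomes "A B C". Whether that is desirable depends on the data.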

Lyrical Success

“I’m not gonna write you a love song” – Sara Bareilles

Post 1 of 4

A question I’ve always wondered about is whether there is a magic formula for creating a number one song. Many songwriters are prolific, yet their songs don’t necessarily have commercial success. It may be difficult to quantify how the musical composition relates to a song’s success. However, I am going to attempt to evaluate the lyrics of songs to determine whether we can accurately predict if a song will be commercially successful.

Lyrical Success – Getting the Data

To obtain the data, I am going to use Beautiful Soup and a few other packages to scrape the content from the websites.

Part 2 of 4

Steps for Getting the Data

  1. Get the Songs Made by the Artists
  2. Extract the List of Songs by the Artist
  3. Scrape the Lyrics for Each Song by the Artist
  4. Extract the Lyrics from Each File
  5. Scrape Rankings by Artists
  6. Parse the Song Rankings Files

The first step is getting the list of the songs by the artist. I am using the website http://www.azlyrics.com to obtain the list of songs and the lyrics for the songs.

Lyrical Success – Preparing the Data

The data created in the previous step needs a little cleaning up before we can get into the model building. Cleaning is a common step, and it ensures that the data can be loaded into models without any issues.

Part 3 of 4

Steps for Preparing the Data

  1. Clean the Rankings
  2. Match the Song Ranks

Lyrical Success – Model Prediction

In this step, I will look at the data to see if we can do any feature engineering. Then I will prepare the data for modeling, train multiple models, pick the best one, and test it. Let’s get started.

Part 4 of 4

Steps for Creating the Model

  1. Song Summary
  2. Visualizations
  3. Prepare the Lyrics for Analysis
  4. Model Building

Computing the Minimum Number of Flight Segments

If we want to compute the minimum number of flight segments between a starting city and a target city, we can construct an undirected graph. In the graph, the nodes represent cities and the edges represent flight segments. The shortest distance between two cities is then the smallest number of edges on any path that connects them.

The same approach applies to any shortest-path problem on an unweighted graph. It is an implementation of the breadth-first search algorithm.

See code below.
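
As a minimal sketch of the breadth-first search described above (not necessarily the code from the original post), the following counts segments on an undirected graph built from a list of city pairs. The function name `min_flight_segments` and the sample route map are hypothetical and exist only for illustration.

```python
from collections import deque


def min_flight_segments(flights, start, target):
    """Return the fewest flight segments from start to target, or None if unreachable."""
    # Build an undirected adjacency list: every flight can be flown both ways.
    graph = {}
    for a, b in flights:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)

    if start == target:
        return 0
    if start not in graph or target not in graph:
        return None

    # Breadth-first search visits cities in order of increasing segment count,
    # so the first time the target is reached the count is minimal.
    visited = {start}
    queue = deque([(start, 0)])
    while queue:
        city, segments = queue.popleft()
        for neighbor in graph[city]:
            if neighbor == target:
                return segments + 1
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append((neighbor, segments + 1))
    return None


# Hypothetical route map, for illustration only.
flights = [("SEA", "DEN"), ("DEN", "ORD"), ("ORD", "JFK"),
           ("SEA", "JFK"), ("MIA", "ATL")]
print(min_flight_segments(flights, "SEA", "ORD"))
```

Because every edge counts as exactly one segment, a plain queue is enough. If the edges carried weights such as miles or fares, a weighted shortest-path algorithm would be needed instead.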

Modeling and Prediction for Movie Audience Ratings

Synopsis

One of the key questions facing film production companies is whether a movie will turn a profit. It is assumed that favorable audience reviews will in turn lead to higher ticket sales or DVD sales, both of which directly affect a movie’s profitability.

The analysis will look at which attributes lead to a higher average audience review score on the public website Rotten Tomatoes.

Spoiler alert: the analysis creates a model that is close, but isn’t 100% confident.
