# Processing/Feature Engineering

### To "score" an area, I needed to transform my TEDS-D data so that each row represents one CBSA and each column counts the clients in that area who have a given label.

### To do this, I one-hot encoded each column, grouped the data by CBSA code, and summed the resulting indicator columns. This let me treat each label in the original dataset as a variable in my score functions.
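The encode, group, and sum steps can be sketched with pandas as follows. This is a minimal illustration on a toy frame, not the actual notebook code: the column names (`CBSA`, `REASON`, `PSOURCE`) and the label codes are assumptions standing in for the real TEDS-D fields.

```python
import pandas as pd

# Toy stand-in for TEDS-D. Column names and codes are illustrative assumptions.
toy = pd.DataFrame({
    "CBSA": [10180, 10180, 10420, 10420, 10420],
    "REASON": [1, 2, 1, 1, 3],
    "PSOURCE": [1, 1, 7, 1, 7],
})

label_cols = ["REASON", "PSOURCE"]

# One indicator column per (column, label) pair, e.g. REASON_1, PSOURCE_7.
encoded = pd.get_dummies(toy, columns=label_cols, dtype=int)

# One row per CBSA; each cell is the number of clients with that label.
counts = encoded.groupby("CBSA").sum()
print(counts)
```

Because every indicator is 0 or 1, summing within a CBSA is the same as counting clients, which is what makes each label usable as a numeric variable afterwards.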

# The Long List of Variables

### The table below describes all of the variables I am using to generate this score.

I included the sign of each coefficient to distinguish between "good" and "bad" variables.

"Good" variables have a positive impact on the score.

"Bad" variables have a negative impact on the score.

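To make the role of the signs concrete, here is a minimal sketch of a signed score, assuming the simplest possible form: each count is multiplied by +1 or -1 and the results are added. The variable names, the counts, and the unit weights are all hypothetical; the score functions used in this analysis may weight or normalize the variables differently.

```python
import pandas as pd

# Hypothetical per-CBSA counts. Column names and values are illustrative only.
counts = pd.DataFrame(
    {"completed": [40, 10], "dropped_out": [5, 20]},
    index=pd.Index([10180, 10420], name="CBSA"),
)

# +1 marks a "good" variable, -1 a "bad" one (assumed unit weights).
signs = pd.Series({"completed": 1, "dropped_out": -1})

# Signed sum across columns: good counts raise the score, bad counts lower it.
score = counts.mul(signs, axis=1).sum(axis=1)
print(score)
```
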
## To shorten some notation,

# Some Score Functions

# Some Score Plots

## Discharge Scores (y) vs. Referral Scores (x)

## Length-of-Stay Scores (y) vs. Referral Scores (x)

## Length-of-Stay Scores (y) vs. Discharge Scores (x)

## The Everything Scores