Script 821: Total Pub Cost & Pub Cost Percentage Column

Purpose

The script calculates the total publication cost and its percentage for each ‘SBA Strategy’ and ‘Campaign’ group in a dataset.

To Elaborate

The Python script processes a dataset to compute financial metrics related to publication costs. It specifically focuses on calculating the total publication cost for each group defined by ‘SBA Strategy’ and ‘Campaign’. Additionally, it determines the percentage of each row’s publication cost relative to the total cost within its group. This is useful for understanding the distribution of costs across different strategies and campaigns, allowing for better financial analysis and decision-making. The script ensures that the calculated percentages are rounded to two decimal places for precision and formats them for clarity in the output.

Walking Through the Code

  1. Initialization and Setup: The script begins by loading input data into a DataFrame and defining relevant column names for processing. It initializes columns for total publication cost and percentage of total publication cost to zero.

  2. Calculation of Total Publication Cost: The script calculates the total publication cost for each ‘SBA Strategy’ group using the groupby and transform functions, storing the results in the ‘Total $ Pub. Cost’ column.

  3. Percentage Calculation: It computes the percentage of each row’s publication cost relative to the total cost within its group. This percentage is calculated and rounded to two decimal places for accuracy.

  4. Output Preparation: The script selects necessary columns for output, renames them for clarity, and handles any NaN values by replacing them with “N/A”. It converts the percentage values to strings and appends a ‘%’ symbol to non-“N/A” values for better readability.

  5. Display: Finally, the script prints the processed DataFrame, which includes the total publication cost and percentage for each group.

Vitals

  • Script ID : 821
  • Client ID / Customer ID: 1306926629 / 60270083
  • Action Type: Bulk Upload
  • Item Changed: Campaign
  • Output Columns: Account, Campaign, % of Total Pub. Cost, SBA Strategy, Total $ Pub. Cost
  • Linked Datasource: M1 Report
  • Reference Datasource: None
  • Owner: dwaidhas@marinsoftware.com (dwaidhas@marinsoftware.com)
  • Created by dwaidhas@marinsoftware.com on 2024-03-18 16:46
  • Last Updated by dwaidhas@marinsoftware.com on 2024-04-22 16:54
> See it in Action

Python Code

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
##
## name: Total Pub Cost & Pub Cost Percentage Column 
## description:
##  The script clears existing data, calculates the total pub cost for each 'SBA Strategy' and 'Campaign' group, and computes the percentage of 'Pub. Cost $' for each row relative to the total cost within its group.
## 
## author: Dana Waidhas 
## created: 2024-03-18
## 

# setup timezone
# Chicago Timezone is GMT-5. Adjust as needed.
CLIENT_TIMEZONE = datetime.timezone(datetime.timedelta(hours=-5))

# Load input data
inputDf = dataSourceDict["1"]

# Define column names
RPT_COL_CAMPAIGN = 'Campaign'
RPT_COL_ACCOUNT = 'Account'
RPT_COL_SBA_STRATEGY = 'SBA Strategy'
RPT_COL_PUB_COST = 'Pub. Cost $'
RPT_COL_TOTAL_PUB_COST = 'Total $ Pub. Cost'
RPT_COL_PCT_OF_TOTAL_PUB_COST = '% of Total Pub. Cost'

# Clear 'Total $ Pub. Cost' and '% of Total Pub. Cost'
inputDf[RPT_COL_TOTAL_PUB_COST] = 0
inputDf[RPT_COL_PCT_OF_TOTAL_PUB_COST] = 0

# Calculate Total $ Pub. Cost for each SBA Strategy group
inputDf[RPT_COL_TOTAL_PUB_COST] = round(inputDf.groupby(RPT_COL_SBA_STRATEGY)[RPT_COL_PUB_COST].transform('sum'), 2)


# Calculate % of Total Pub. Cost for each row
inputDf[RPT_COL_PCT_OF_TOTAL_PUB_COST] = (inputDf[RPT_COL_PUB_COST] / inputDf[RPT_COL_TOTAL_PUB_COST]) * 100

# Round the '% of Total Pub. Cost' column to two decimal places
inputDf[RPT_COL_PCT_OF_TOTAL_PUB_COST] = round(inputDf[RPT_COL_PCT_OF_TOTAL_PUB_COST], 2)

# Define output column names
BULK_COL_CAMPAIGN = 'Campaign'
BULK_COL_ACCOUNT = 'Account'
BULK_COL_SBA_STRATEGY = 'SBA Strategy'
BULK_COL_TOTAL_PUB_COST = 'Total $ Pub. Cost'
BULK_COL_PCT_OF_TOTAL_PUB_COST = '% of Total Pub. Cost'

# Select required columns for output
outputDf = inputDf[[RPT_COL_CAMPAIGN, RPT_COL_ACCOUNT, RPT_COL_SBA_STRATEGY, RPT_COL_TOTAL_PUB_COST, RPT_COL_PCT_OF_TOTAL_PUB_COST]]

# Rename columns
outputDf.columns = [BULK_COL_CAMPAIGN, BULK_COL_ACCOUNT, BULK_COL_SBA_STRATEGY, BULK_COL_TOTAL_PUB_COST, BULK_COL_PCT_OF_TOTAL_PUB_COST]

# Replace NaN values with "N/A" in the '% of Total Pub. Cost' column
outputDf[BULK_COL_PCT_OF_TOTAL_PUB_COST].fillna('N/A', inplace=True)

# Convert the '% of Total Pub. Cost' column to strings
outputDf[BULK_COL_PCT_OF_TOTAL_PUB_COST] = outputDf[BULK_COL_PCT_OF_TOTAL_PUB_COST].astype(str)

# Append '%' to non-"N/A" values
outputDf[BULK_COL_PCT_OF_TOTAL_PUB_COST] = outputDf[BULK_COL_PCT_OF_TOTAL_PUB_COST].apply(lambda x: x + '%' if x != 'N/A' else 'N/A')

# Display the output dataframe
print(outputDf)

Post generated on 2024-11-27 06:58:46 GMT

comments powered by Disqus