Script 583: Scripts Campaign Dims v2
Purpose
Python script to summarize and deduplicate campaign data.
To Elaborate
The Python script takes campaign data as input and performs the following tasks:
- Creates two dataframes, df1 and df2, from the input data.
- Converts the ‘Pacing - Start Date’ and ‘Pacing - End Date’ columns in df1 to datetime format.
- Populates the ‘Pub. Cost $’, ‘Clicks’, and ‘Impr.’ columns in df1 with the sum of values from df2 for each campaign.
- Drops duplicates based on the ‘Campaign’ column in df1.
- Prints the deduplicated dataframe, df1_deduplicated.
Walking Through the Code
- The script starts by importing the necessary libraries and defining constants.
- It creates a dataframe, df1, by selecting specific columns from the input data.
- It creates another dataframe, df2, by selecting different columns from the input data.
- The script converts the ‘Pacing - Start Date’ and ‘Pacing - End Date’ columns in df1 to datetime format.
- It iterates over each row in df1 and performs the following steps:
- Retrieves the campaign, start date, and end date from the row.
- Filters df2 for the specific campaign and within the date range.
- Calculates the sum of ‘Pub. Cost $’, ‘Clicks’, ‘Impr.’, and ‘Video Views’ for the filtered data.
- Assigns the calculated values to the corresponding columns in df1.
- After iterating through all rows in df1, the script drops duplicates based on the ‘Campaign’ column and assigns the result to df1_deduplicated.
- Finally, it prints the deduplicated dataframe, df1_deduplicated.
Vitals
- Script ID : 583
- Client ID / Customer ID: 1306926843 / 60270139
- Action Type: Bulk Upload (Preview)
- Item Changed: Campaign
- Output Columns: Account, Campaign, CPM Imp Target, Target (Impr/Spend/Views), Goal, Pacing - Start Date, Pacing - End Date
- Linked Datasource: M1 Report
- Reference Datasource: None
- Owner: Jesus Garza (jgarza@marinsoftware.com)
- Created by Jesus Garza on 2023-12-13 16:07
- Last Updated by Jesus Garza on 2023-12-13 23:09
> See it in Action
Python Code
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
##
## name: Scripts - Campaign Dims v2
## description:
##
##
## author:
## created: 2023-12-13
##
today = datetime.datetime.now(CLIENT_TIMEZONE).date()
# primary data source and columns
inputDf = dataSourceDict["1"]
RPT_COL_CAMPAIGN = 'Campaign'
RPT_COL_DATE = 'Date'
RPT_COL_ACCOUNT = 'Account'
RPT_COL_CPM_IMP_TARGET = 'CPM Imp Target'
RPT_COL_TARGET_IMPR_PER_SPENDVIEWS = 'Target (Impr/Spend/Views)'
RPT_COL_PACING__START_DATE = 'Pacing - Start Date'
RPT_COL_PACING__END_DATE = 'Pacing - End Date'
RPT_COL_GOAL = 'Goal'
RPT_COL_PUB_COST = 'Pub. Cost $'
RPT_COL_CLICKS = 'Clicks'
RPT_COL_IMPR = 'Impr.'
RPT_COL_VIDEO_VIEWS = 'Video Views'
# output columns and initial values
BULK_COL_ACCOUNT = 'Account'
BULK_COL_CAMPAIGN = 'Campaign'
BULK_COL_CPM_IMP_TARGET = 'CPM Imp Target'
BULK_COL_GOAL = 'Goal'
BULK_COL_PACING__END_DATE = 'Pacing - End Date'
BULK_COL_PACING__START_DATE = 'Pacing - Start Date'
BULK_COL_TARGET_IMPR_PER_SPENDVIEWS = 'Target (Impr/Spend/Views)'
outputDf[BULK_COL_CPM_IMP_TARGET] = "<<YOUR VALUE>>"
outputDf[BULK_COL_GOAL] = "<<YOUR VALUE>>"
outputDf[BULK_COL_PACING__END_DATE] = "<<YOUR VALUE>>"
outputDf[BULK_COL_PACING__START_DATE] = "<<YOUR VALUE>>"
outputDf[BULK_COL_TARGET_IMPR_PER_SPENDVIEWS] = "<<YOUR VALUE>>"
# Create DataFrame 1
data_df1 = inputDf[[RPT_COL_DATE,RPT_COL_CAMPAIGN, RPT_COL_ACCOUNT, RPT_COL_CPM_IMP_TARGET, RPT_COL_TARGET_IMPR_PER_SPENDVIEWS, RPT_COL_PACING__START_DATE, RPT_COL_PACING__END_DATE, RPT_COL_GOAL, RPT_COL_CLICKS, RPT_COL_IMPR]].copy()
# Create DataFrame 2
data_df2 = inputDf[[RPT_COL_DATE, RPT_COL_CAMPAIGN, RPT_COL_PUB_COST, RPT_COL_CLICKS, RPT_COL_IMPR, RPT_COL_VIDEO_VIEWS]]
# Create DataFrame 1 and DataFrame 2
df1 = pd.DataFrame(data_df1)
df2 = pd.DataFrame(data_df2)
# Convert 'Pacing - Start Date' and 'Pacing - End Date' columns to datetime
df1[RPT_COL_PACING__START_DATE] = pd.to_datetime(df1[RPT_COL_PACING__START_DATE])
df1[RPT_COL_PACING__END_DATE] = pd.to_datetime(df1[RPT_COL_PACING__END_DATE])
df2[RPT_COL_DATE] = pd.to_datetime(df2[RPT_COL_DATE])
# Populate 'Pub. Cost $', 'Clicks', and 'Impr.' in df1 with the sum of values from df2 for each campaign
for index, row in df1.iterrows():
campaign = row[RPT_COL_CAMPAIGN]
start_date = row[RPT_COL_PACING__START_DATE]
end_date = row[RPT_COL_PACING__END_DATE]
# Filter df2 for the specific campaign and within the date range
filtered_data = df2[(df2[RPT_COL_CAMPAIGN] == campaign) & (df2[RPT_COL_DATE] >= start_date) & (df2[RPT_COL_DATE] <= end_date)]
# Calculate the sum of 'Pub. Cost $', 'Clicks', and 'Impr.' and assign them to df1
total_cost = filtered_data[RPT_COL_PUB_COST].sum()
total_clicks = filtered_data[RPT_COL_CLICKS].sum()
total_impressions = filtered_data[RPT_COL_IMPR].sum()
total_views = filtered_data[RPT_COL_VIDEO_VIEWS].sum()
df1.loc[index, RPT_COL_PUB_COST] = total_cost
df1.loc[index, RPT_COL_CLICKS] = total_clicks
df1.loc[index, RPT_COL_IMPR] = total_impressions
df1.loc[index, RPT_COL_VIDEO_VIEWS] = total_views
# Drop duplicates based on the 'Campaign' column
df1_deduplicated = df1.drop_duplicates(subset=[RPT_COL_CAMPAIGN])
print(df1_deduplicated)
Post generated on 2024-05-15 07:44:05 GMT