Script 583: Scripts Campaign Dims v2

Purpose

The Python script processes campaign data to aggregate and deduplicate information based on specific criteria.

To Elaborate

The script is designed to manage and process campaign data by aggregating key metrics such as publication cost, clicks, impressions, and video views for each campaign within specified date ranges. It utilizes two data frames derived from an input data source to perform these operations. The script filters data based on campaign names and date ranges, calculates the sum of relevant metrics, and updates the primary data frame with these aggregated values. Finally, it removes duplicate entries based on the campaign column to ensure each campaign is represented uniquely in the output. This process is crucial for generating accurate reports and insights into campaign performance over time.

Walking Through the Code

  1. Data Preparation
    • The script begins by defining the primary data source and selecting relevant columns to create two separate data frames, data_df1 and data_df2.
    • These data frames are then converted into df1 and df2 using the pandas library.
  2. Data Conversion and Filtering
    • The script converts date columns in df1 and df2 to datetime objects to facilitate date-based filtering.
    • It iterates over each row in df1, filtering df2 to match campaigns and date ranges specified in df1.
  3. Aggregation and Update
    • For each campaign, the script calculates the sum of publication cost, clicks, impressions, and video views from the filtered data in df2.
    • These aggregated values are then assigned back to the corresponding rows in df1.
  4. Deduplication
    • The script removes duplicate entries from df1 based on the campaign column to ensure each campaign is uniquely represented.
    • The final deduplicated data frame is printed, providing a concise summary of campaign performance metrics.

Vitals

  • Script ID : 583
  • Client ID / Customer ID: 1306926843 / 60270139
  • Action Type: Bulk Upload (Preview)
  • Item Changed: Campaign
  • Output Columns: Account, Campaign, CPM Imp Target, Target (Impr/Spend/Views), Goal, Pacing - Start Date, Pacing - End Date
  • Linked Datasource: M1 Report
  • Reference Datasource: None
  • Owner: Jesus Garza (jgarza@marinsoftware.com)
  • Created by Jesus Garza on 2023-12-13 16:07
  • Last Updated by Jesus Garza on 2023-12-13 23:09
> See it in Action

Python Code

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
##
## name: Scripts -  Campaign Dims v2
## description:
##  
## 
## author: 
## created: 2023-12-13
## 

today = datetime.datetime.now(CLIENT_TIMEZONE).date()

# primary data source and columns
inputDf = dataSourceDict["1"]
RPT_COL_CAMPAIGN = 'Campaign'
RPT_COL_DATE = 'Date'
RPT_COL_ACCOUNT = 'Account'
RPT_COL_CPM_IMP_TARGET = 'CPM Imp Target'
RPT_COL_TARGET_IMPR_PER_SPENDVIEWS = 'Target (Impr/Spend/Views)'
RPT_COL_PACING__START_DATE = 'Pacing - Start Date'
RPT_COL_PACING__END_DATE = 'Pacing - End Date'
RPT_COL_GOAL = 'Goal'
RPT_COL_PUB_COST = 'Pub. Cost $'
RPT_COL_CLICKS = 'Clicks'
RPT_COL_IMPR = 'Impr.'
RPT_COL_VIDEO_VIEWS = 'Video Views'

# output columns and initial values
BULK_COL_ACCOUNT = 'Account'
BULK_COL_CAMPAIGN = 'Campaign'
BULK_COL_CPM_IMP_TARGET = 'CPM Imp Target'
BULK_COL_GOAL = 'Goal'
BULK_COL_PACING__END_DATE = 'Pacing - End Date'
BULK_COL_PACING__START_DATE = 'Pacing - Start Date'
BULK_COL_TARGET_IMPR_PER_SPENDVIEWS = 'Target (Impr/Spend/Views)'
outputDf[BULK_COL_CPM_IMP_TARGET] = "<<YOUR VALUE>>"
outputDf[BULK_COL_GOAL] = "<<YOUR VALUE>>"
outputDf[BULK_COL_PACING__END_DATE] = "<<YOUR VALUE>>"
outputDf[BULK_COL_PACING__START_DATE] = "<<YOUR VALUE>>"
outputDf[BULK_COL_TARGET_IMPR_PER_SPENDVIEWS] = "<<YOUR VALUE>>"

# Create DataFrame 1
data_df1 = inputDf[[RPT_COL_DATE,RPT_COL_CAMPAIGN, RPT_COL_ACCOUNT, RPT_COL_CPM_IMP_TARGET, RPT_COL_TARGET_IMPR_PER_SPENDVIEWS, RPT_COL_PACING__START_DATE, RPT_COL_PACING__END_DATE, RPT_COL_GOAL, RPT_COL_CLICKS, RPT_COL_IMPR]].copy()

# Create DataFrame 2
data_df2 = inputDf[[RPT_COL_DATE, RPT_COL_CAMPAIGN, RPT_COL_PUB_COST, RPT_COL_CLICKS, RPT_COL_IMPR, RPT_COL_VIDEO_VIEWS]]

# Create DataFrame 1 and DataFrame 2
df1 = pd.DataFrame(data_df1)
df2 = pd.DataFrame(data_df2)

# Convert 'Pacing - Start Date' and 'Pacing - End Date' columns to datetime
df1[RPT_COL_PACING__START_DATE] = pd.to_datetime(df1[RPT_COL_PACING__START_DATE])
df1[RPT_COL_PACING__END_DATE] = pd.to_datetime(df1[RPT_COL_PACING__END_DATE])
df2[RPT_COL_DATE] = pd.to_datetime(df2[RPT_COL_DATE])

# Populate 'Pub. Cost $', 'Clicks', and 'Impr.' in df1 with the sum of values from df2 for each campaign
for index, row in df1.iterrows():
    campaign = row[RPT_COL_CAMPAIGN]
    start_date = row[RPT_COL_PACING__START_DATE]
    end_date = row[RPT_COL_PACING__END_DATE]
    
    # Filter df2 for the specific campaign and within the date range
    filtered_data = df2[(df2[RPT_COL_CAMPAIGN] == campaign) & (df2[RPT_COL_DATE] >= start_date) & (df2[RPT_COL_DATE] <= end_date)]
    
    # Calculate the sum of 'Pub. Cost $', 'Clicks', and 'Impr.' and assign them to df1
    total_cost = filtered_data[RPT_COL_PUB_COST].sum()
    total_clicks = filtered_data[RPT_COL_CLICKS].sum()
    total_impressions = filtered_data[RPT_COL_IMPR].sum()
    total_views = filtered_data[RPT_COL_VIDEO_VIEWS].sum()

    df1.loc[index, RPT_COL_PUB_COST] = total_cost
    df1.loc[index, RPT_COL_CLICKS] = total_clicks
    df1.loc[index, RPT_COL_IMPR] = total_impressions
    df1.loc[index, RPT_COL_VIDEO_VIEWS] = total_views

# Drop duplicates based on the 'Campaign' column
df1_deduplicated = df1.drop_duplicates(subset=[RPT_COL_CAMPAIGN])

print(df1_deduplicated)

Post generated on 2024-11-27 06:58:46 GMT

comments powered by Disqus