Script 1361: SCRIPT Assign Campaign Dimension Labels
Purpose
The script assigns dimension labels to campaigns based on their naming conventions and campaign types.
To Elaborate
The Python script is designed to categorize and label marketing campaigns by analyzing their names and types. It processes a dataset containing campaign information and assigns specific labels to each campaign based on predefined naming conventions. The script identifies keywords within campaign names to determine categories such as “Non-Branded,” “Student,” “Awareness,” and “Brand.” Additionally, it labels campaigns by device type (e.g., “Mobile” or “Desktop”) and targeting strategy (e.g., “Dynamic” or “Discovery”). This automated labeling helps streamline the organization and analysis of marketing campaigns, ensuring consistent categorization across datasets.
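The keyword rules can be sketched in plain Python for a single campaign name. This is an illustrative sketch rather than the script itself: the names `CATEGORY_RULES` and `label_category` are hypothetical, while the keywords and labels are copied from the script's code. Rules run in order and a later match overwrites an earlier one, which mirrors the sequential assignments in the script.

```python
# Hypothetical rule list: (keyword, label) pairs in the order the script applies them.
CATEGORY_RULES = [
    ("artist", "Non-Branded"),
    ("podcast", "Non-Branded"),
    ("family", "Non-Branded"),
    ("echo", "Non-Branded"),
    ("content", "Non-Branded"),
    ("student", "Student"),
    ("awareness", "Awareness"),
    ("competitor", "Non-Branded"),
    ("generic", "Non-Branded"),
    ("brand", "Brand"),
]


def label_category(campaign_name: str) -> str:
    """Return the category for one campaign name; the last matching rule wins."""
    name = campaign_name.lower()
    category = ""  # campaigns with no matching keyword stay unlabeled
    for keyword, label in CATEGORY_RULES:
        if keyword in name:
            category = label
    return category


print(label_category("US_Podcast_Mobile"))  # Non-Branded
print(label_category("UK_Student_Brand"))   # Brand ("brand" runs after "student")
print(label_category("DE_Shopping"))        # empty string, no keyword matched
```

One consequence of substring matching is that a name such as "Non-Brand Generic" would be labeled "Brand", because it contains "brand" and that rule runs last.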
Walking Through the Code
- Data Preparation
- The script begins by defining the primary data source and relevant columns from the input DataFrame.
- It initializes the output DataFrame with specific columns and default values for campaign categories, devices, targeting, and a check column.
- Processing Function
- A function named process is defined to handle the main logic of the script.
- It creates a copy of the necessary columns from the input DataFrame to the output DataFrame.
- A lowercase copy of the ‘Campaign’ column is stored in a temporary column so that keyword matching is case-insensitive.
- Label Assignment
- The script assigns values to the ‘Campaign_Category’ column based on keywords found in the campaign names.
- Similarly, it assigns values to the ‘Campaign_Device’ column based on device-related keywords.
- The ‘Campaign_Targeting’ column is initially set to the ‘Campaign Type’ and updated based on specific keywords.
- Final Adjustments
- The ‘cdimcheck’ column is set to “YES” for all entries to indicate that the dimension check is complete.
- A temporary column used for processing is removed from the DataFrame.
- The processed data is printed for verification purposes.
- Testing and Execution
- A placeholder unit test function, test_process, is included. As written, it only prints a pass marker and does not call the process function.
- The main process is triggered by calling the process function with the input DataFrame.
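The steps above can be sketched end to end on a small DataFrame. This is a trimmed-down illustration under stated assumptions, not the script's own code: the sample rows are made up, `label_campaigns` and `has` are hypothetical names, only a few of the keyword rules are included, and `na=False` is an addition so that a missing campaign name never matches.

```python
import pandas as pd

# Made-up sample rows standing in for the report data source (illustration only).
sample = pd.DataFrame({
    "Account": ["A1", "A1", "A2"],
    "Campaign": ["US_Brand_Mobile", "US_Generic_DSA_Desktop", None],
    "Campaign Type": ["Search", "Search", "Display"],
})


def label_campaigns(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical, trimmed-down variant of the script's process() function."""
    out = df[["Account", "Campaign"]].copy()
    # Temporary lowercase column for case-insensitive keyword matching.
    out["campaign_lower"] = out["Campaign"].str.lower()

    def has(keyword: str) -> pd.Series:
        # regex=False: plain substring match; na=False: missing names never match.
        return out["campaign_lower"].str.contains(keyword, regex=False, na=False)

    out["Campaign_Category"] = ""
    out.loc[has("generic"), "Campaign_Category"] = "Non-Branded"
    out.loc[has("brand"), "Campaign_Category"] = "Brand"

    out["Campaign_Device"] = ""
    out.loc[has("mobile"), "Campaign_Device"] = "Mobile"
    out.loc[has("desktop"), "Campaign_Device"] = "Desktop"

    # Targeting defaults to the campaign type, then keyword overrides apply.
    out["Campaign_Targeting"] = df["Campaign Type"]
    out.loc[has("dsa"), "Campaign_Targeting"] = "Dynamic"

    out["cdimcheck"] = "YES"
    return out.drop(columns=["campaign_lower"])


labeled = label_campaigns(sample)
print(labeled)
```

The script calls `str.contains` without `na=False`. Depending on the pandas version and column dtype, a missing campaign name can then produce a missing value in the mask, which `.loc` may reject, so passing `na=False` is a defensive choice worth considering.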
Vitals
- Script ID: 1361
- Client ID / Customer ID: 247648668 / 13095968
- Action Type: Bulk Upload
- Item Changed: Campaign
- Output Columns: Account, Campaign, Campaign_Category, Campaign_Device, Campaign_Targeting, cdimcheck
- Linked Datasource: M1 Report
- Reference Datasource: None
- Owner: Jeremy Brown (jbrown@marinsoftware.com)
- Created by Jeremy Brown on 2024-08-29 15:41
- Last Updated by Jeremy Brown on 2024-08-29 15:43
Python Code
## author: Jeremy Brown
## created: 2024-08-29
##
today = datetime.datetime.now(CLIENT_TIMEZONE).date()
# primary data source and columns
inputDf = dataSourceDict["1"]
RPT_COL_CLIENT = 'Client'
RPT_COL_CAMPAIGN = 'Campaign'
RPT_COL_ACCOUNT = 'Account'
RPT_COL_CDIMCHECK = 'cdimcheck'
RPT_COL_CAMPAIGN_TYPE = 'Campaign Type'
RPT_COL_CAMPAIGN_CATEGORY = 'Campaign_Category'
RPT_COL_CAMPAIGN_TARGETING = 'Campaign_Targeting'
RPT_COL_CAMPAIGN_DEVICE = 'Campaign_Device'
RPT_COL_CAMPAIGN_STATUS = 'Campaign Status'
RPT_COL_IMPR = 'Impr.'
# output columns and initial values
BULK_COL_CLIENT = 'Client'
BULK_COL_ACCOUNT = 'Account'
BULK_COL_CAMPAIGN = 'Campaign'
BULK_COL_CAMPAIGN_CATEGORY = 'Campaign_Category'
BULK_COL_CAMPAIGN_DEVICE = 'Campaign_Device'
BULK_COL_CAMPAIGN_TARGETING = 'Campaign_Targeting'
BULK_COL_CDIMCHECK = 'cdimcheck'
outputDf[BULK_COL_CAMPAIGN_CATEGORY] = "<<YOUR VALUE>>"
outputDf[BULK_COL_CAMPAIGN_DEVICE] = "<<YOUR VALUE>>"
outputDf[BULK_COL_CAMPAIGN_TARGETING] = "<<YOUR VALUE>>"
outputDf[BULK_COL_CDIMCHECK] = "<<YOUR VALUE>>"
# Function to process the input DataFrame and populate the output DataFrame
def process(inputDf):
    # Make a copy of the relevant columns from the input DataFrame for the output DataFrame
    outputDf = inputDf[['Client', 'Campaign', 'Account', 'cdimcheck', 'Campaign_Category', 'Campaign_Targeting', 'Campaign_Device']].copy()

    # Store a lowercase copy of 'Campaign' in a temporary column for easier keyword checks
    outputDf['campaign_lower'] = outputDf['Campaign'].str.lower()

    # Assign 'Campaign_Category' based on the campaign naming convention
    # (rules run in order, so a later match overwrites an earlier one)
    outputDf['Campaign_Category'] = ""
    outputDf.loc[outputDf['campaign_lower'].str.contains('artist'), 'Campaign_Category'] = "Non-Branded"
    outputDf.loc[outputDf['campaign_lower'].str.contains('podcast'), 'Campaign_Category'] = "Non-Branded"
    outputDf.loc[outputDf['campaign_lower'].str.contains('family'), 'Campaign_Category'] = "Non-Branded"
    outputDf.loc[outputDf['campaign_lower'].str.contains('echo'), 'Campaign_Category'] = "Non-Branded"
    outputDf.loc[outputDf['campaign_lower'].str.contains('content'), 'Campaign_Category'] = "Non-Branded"
    outputDf.loc[outputDf['campaign_lower'].str.contains('student'), 'Campaign_Category'] = "Student"
    outputDf.loc[outputDf['campaign_lower'].str.contains('awareness'), 'Campaign_Category'] = "Awareness"
    outputDf.loc[outputDf['campaign_lower'].str.contains('competitor'), 'Campaign_Category'] = "Non-Branded"
    outputDf.loc[outputDf['campaign_lower'].str.contains('generic'), 'Campaign_Category'] = "Non-Branded"
    outputDf.loc[outputDf['campaign_lower'].str.contains('brand'), 'Campaign_Category'] = "Brand"

    # Assign 'Campaign_Device' based on the campaign naming convention
    outputDf['Campaign_Device'] = ""
    outputDf.loc[outputDf['campaign_lower'].str.contains('mobile'), 'Campaign_Device'] = "Mobile"
    outputDf.loc[outputDf['campaign_lower'].str.contains('desktop'), 'Campaign_Device'] = "Desktop"

    # Assign 'Campaign_Targeting' based on 'Campaign Type' and specific campaign names
    outputDf['Campaign_Targeting'] = inputDf['Campaign Type']  # Default to 'Campaign Type' column
    outputDf.loc[outputDf['campaign_lower'].str.contains('dsa'), 'Campaign_Targeting'] = "Dynamic"
    outputDf.loc[outputDf['campaign_lower'].str.contains('discovery'), 'Campaign_Targeting'] = "Discovery"

    # Set 'cdimcheck' to "YES" for all entries
    outputDf['cdimcheck'] = "YES"

    # Drop the temporary 'campaign_lower' column
    outputDf.drop(columns=['campaign_lower'], inplace=True)

    # Print the processed data to aid debugging
    print("Data after processing:")
    print(outputDf)
    return outputDf

# Unit test function for process (placeholder: it does not call process)
def test_process():
    print("###UNITTEST START####")
    try:
        # Assuming the function is tested with correct input data here
        # If the test passes, print pass message
        print("####PASS####")
    except Exception as e:
        # If the test fails, print fail message
        print(f"####FAIL#### {e}")
# Trigger the main process
outputDf = process(inputDf)
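Since test_process does not assert anything, a check that actually exercises the labeling could look like the sketch below. It is offered as one possible approach, not as part of the script: `make_fixture` and `check_process` are hypothetical helpers, the fixture values are made up, and the expected labels were derived by reading the keyword rules above.

```python
import pandas as pd


def make_fixture() -> pd.DataFrame:
    """Build a tiny input frame with the columns process() selects (values are made up)."""
    return pd.DataFrame({
        "Client": ["c", "c"],
        "Campaign": ["US_Brand_Mobile", "US_Student_Discovery"],
        "Account": ["a", "a"],
        "cdimcheck": ["", ""],
        "Campaign_Category": ["", ""],
        "Campaign_Targeting": ["", ""],
        "Campaign_Device": ["", ""],
        "Campaign Type": ["Search", "Display"],
    })


def check_process(process_fn) -> bool:
    """Run process_fn on the fixture and compare its output with the expected labels."""
    result = process_fn(make_fixture())
    expected = {
        "Campaign_Category": ["Brand", "Student"],
        "Campaign_Device": ["Mobile", ""],
        "Campaign_Targeting": ["Search", "Discovery"],
        "cdimcheck": ["YES", "YES"],
    }
    return all(list(result[col]) == values for col, values in expected.items())
```

Inside the script this would be invoked as `check_process(process)`. Passing the function in as an argument keeps the check independent of the platform-provided `inputDf`.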
Post generated on 2024-11-27 06:58:46 GMT