Script 1597: Script Autotag Campaign Program
Purpose:
The Python script processes campaign data to extract and map program codes to their full names, filtering out entries where this mapping fails.
To Elaborate
The script is designed to process a dataset containing campaign information, specifically focusing on extracting a program code embedded within the campaign names. It maps these codes to their corresponding full program names using a predefined dictionary. The primary goal is to transform the campaign data by adding a new column that contains the full program names, which are derived from the campaign names. If the extraction and mapping process fails for any campaign, those entries are removed from the final output. This ensures that the resulting dataset only includes campaigns with successfully identified and mapped program names, facilitating better analysis and reporting of campaign performance by program.
Walking Through the Code
- Data Preparation:
- The script begins by selecting the relevant columns from the input data source, specifically the ‘Account’ and ‘Campaign’ columns, and creates a new DataFrame for output.
- The column names in the output DataFrame are standardized to ‘Account’ and ‘Campaign’.
- Program Code Mapping:
- A dictionary named
PROGRAM_MAPPING
is defined to map short program codes to their full names. - The script includes a function
extract_and_map_program
that takes a campaign name, splits it by underscores, and attempts to extract the program code located between the second and third underscores. - This extracted code is then mapped to its full name using the
PROGRAM_MAPPING
dictionary. If the code is not found in the dictionary or if the extraction fails, the function returnsNone
.
- A dictionary named
- Data Transformation:
- The script applies the
extract_and_map_program
function to the ‘Campaign’ column of the output DataFrame, creating a new column ‘Program_Dim’ with the mapped full names. - Rows where the program extraction and mapping result in
None
are removed from the DataFrame, ensuring only valid entries are retained.
- The script applies the
- Output:
- The script concludes by printing the first few rows of both the input and output DataFrames for verification purposes.
Vitals
- Script ID : 1597
- Client ID / Customer ID: 1306916369 / 2
- Action Type: Bulk Upload (Preview)
- Item Changed: Campaign
- Output Columns: Account, Campaign, Program_Dim
- Linked Datasource: M1 Report
- Reference Datasource: None
- Owner: Grégory Pantaine (gpantaine@marinsoftware.com)
- Created by Grégory Pantaine on 2024-12-23 20:53
- Last Updated by Grégory Pantaine on 2024-12-23 20:59
> See it in Action
Python Code
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
# primary data source and columns
inputDf = dataSourceDict["1"]
RPT_COL_CAMPAIGN = 'Campaign'
RPT_COL_ACCOUNT = 'Account'
RPT_COL_CAMPAIGN_STATUS = 'Campaign Status'
RPT_COL_CAMPAIGN_TYPE = 'Campaign Type'
RPT_COL_IMPR = 'Impr.'
RPT_COL_CLICKS = 'Clicks'
RPT_COL_CTR = 'CTR %'
RPT_COL_DAILY_BUDGET = 'Daily Budget'
RPT_COL_LAST_MODIFIED_DATE = 'Last Modified Date'
RPT_COL_RECOMMENDED_DAILY_BUDGET = 'Recommended Daily Budget'
RPT_COL_PUB_COST = 'Pub. Cost £'
RPT_COL_CAMPAIGN_CREATION_DATE = 'Campaign Creation Date'
RPT_COL_PUB_ID = 'Pub. ID'
RPT_COL_CAMPAIGN_OVERRIDE = 'Campaign Override'
RPT_COL_NETWORK = 'Network'
RPT_COL_NETWORK_TARGETING = 'Network Targeting'
RPT_COL_PUBLISHER = 'Publisher'
RPT_COL_PUBLISHER_BID_CAP = 'Publisher Bid Cap'
RPT_COL_PUBLISHER_BID_PORTFOLIO_NAME = 'Publisher Bid Portfolio Name'
RPT_COL_PUBLISHER_BID_STRATEGY = 'Publisher Bid Strategy'
RPT_COL_PUBLISHER_BID_STRATEGY_LEVEL = 'Publisher Bid Strategy Level'
RPT_COL_PUBLISHER_NAME = 'Publisher Name'
RPT_COL_PUBLISHER_TARGET_IS = 'Publisher Target IS %'
RPT_COL_PUBLISHER_TARGET_IS_TYPE = 'Publisher Target IS Type'
RPT_COL_PUBLISHER_TARGET_ROAS = 'Publisher Target ROAS'
RPT_COL_PUBLISHER_TARGET_CPA = 'Publisher Target CPA'
RPT_COL_COUNTRY_DIM = 'Country_Dim'
RPT_COL_FRIENDLYNAME_DIM = 'FriendlyName_Dim'
RPT_COL_LINE_OF_BUSINESS_DIM = 'Line of Business_Dim'
RPT_COL_PROGRAM_DIM = 'Program_Dim'
# output columns and initial values
BULK_COL_ACCOUNT = 'Account'
BULK_COL_CAMPAIGN = 'Campaign'
BULK_COL_PROGRAM_DIM = 'Program_Dim'
# Program code to full name mapping
PROGRAM_MAPPING = {
'EMX': 'EmeraldEMX',
'CBUS': 'Client Briefings',
'SWPR': 'Solution Webinars',
'PR': 'PR',
'SI': 'Social Influencer',
'AR': 'AR',
'SAL': 'Sales',
'MVERT': 'Merchant Verticals',
'PTNR': 'Partner',
'OB': 'On-boarding',
'CCM': 'Client Communication Merchant',
'RTNK': 'Rethink',
'MENS': 'Merchant Enterprise New Sales',
'MECS': 'Merchant Enterprise Cross Sales',
'MEA': 'Merchant Enterprise All',
'MNS': 'Merchant SMB New Sales',
'PM': 'Paid Media'
}
# Function to extract value between second and third underscore and map to full name
def extract_and_map_program(campaign_name):
try:
parts = campaign_name.split('_')
if len(parts) >= 3:
program_code = parts[2]
# Return mapped full name if exists, otherwise return original code
return PROGRAM_MAPPING.get(program_code, program_code)
return None # Return None if format doesn't match
except:
return None # Return None for any errors
# Create output dataframe with required columns
outputDf = inputDf[[RPT_COL_ACCOUNT, RPT_COL_CAMPAIGN]].copy()
outputDf.columns = [BULK_COL_ACCOUNT, BULK_COL_CAMPAIGN]
# Apply the extraction and mapping function
outputDf[BULK_COL_PROGRAM_DIM] = outputDf[BULK_COL_CAMPAIGN].apply(extract_and_map_program)
# Remove rows where program extraction failed (where value is None)
outputDf = outputDf.dropna(subset=[BULK_COL_PROGRAM_DIM])
# Debug print first few rows
print(tableize(inputDf.head()))
print("\nOutput DataFrame:")
print(tableize(outputDf.head()))
Post generated on 2025-03-11 01:25:51 GMT