Script 1589: Script Auto Tag Program
Purpose:
The Python script processes campaign data to extract and map program codes to their full names, creating a structured output for further analysis.
To Elaborate
The script is designed to process a dataset containing campaign information, specifically focusing on extracting program codes embedded within campaign names. These codes are then mapped to their corresponding full program names using a predefined dictionary. The primary goal is to create a structured output that includes the account, campaign, and campaign ID, along with the full program name. This structured data can be used for further analysis or reporting purposes. The script ensures that only rows with successfully extracted and mapped program names are included in the final output, thereby maintaining data integrity and relevance.
Walking Through the Code
- Data Preparation:
- The script begins by defining the primary data source and relevant columns from the input dataset.
- It sets up a mapping dictionary (
PROGRAM_MAPPING
) to translate program codes to full names.
- Function Definition:
- A function
extract_and_map_program
is defined to extract the program code from the campaign name and map it to its full name using the dictionary. If the extraction fails or the code is not found in the dictionary, it returnsNone
.
- A function
- DataFrame Creation:
- An output DataFrame is created by copying specific columns (
Account
,Campaign
,Campaign ID
) from the input DataFrame. - The column names are adjusted to match the desired output format.
- An output DataFrame is created by copying specific columns (
- Data Processing:
- The script applies the
extract_and_map_program
function to theCampaign
column to populate a new columnProgram_Dim
with the mapped full program names. - Rows where the program extraction fails (resulting in
None
) are removed from the output DataFrame.
- The script applies the
- Output:
- The script prints the first few rows of both the input and output DataFrames for verification purposes.
Vitals
- Script ID : 1589
- Client ID / Customer ID: 1306927959 / 60270403
- Action Type: Bulk Upload
- Item Changed: Campaign
- Output Columns: Account, Campaign, Program_Dim, Campaign ID
- Linked Datasource: M1 Report
- Reference Datasource: None
- Owner: Grégory Pantaine (gpantaine@marinsoftware.com)
- Created by Grégory Pantaine on 2024-12-20 14:22
- Last Updated by Grégory Pantaine on 2025-01-16 17:22
> See it in Action
Python Code
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
# primary data source and columns
inputDf = dataSourceDict["1"]
RPT_COL_CAMPAIGN = 'Campaign'
RPT_COL_ACCOUNT = 'Account'
RPT_COL_CAMPAIGN_STATUS = 'Campaign Status'
RPT_COL_CAMPAIGN_TYPE = 'Campaign Type'
RPT_COL_IMPR = 'Impr.'
RPT_COL_CLICKS = 'Clicks'
RPT_COL_CTR = 'CTR %'
RPT_COL_DAILY_BUDGET = 'Daily Budget'
RPT_COL_LAST_MODIFIED_DATE = 'Last Modified Date'
RPT_COL_RECOMMENDED_DAILY_BUDGET = 'Recommended Daily Budget'
RPT_COL_PUB_COST = 'Pub. Cost £'
RPT_COL_CAMPAIGN_CREATION_DATE = 'Campaign Creation Date'
RPT_COL_PUB_ID = 'Pub. ID'
RPT_COL_CAMPAIGN_OVERRIDE = 'Campaign Override'
RPT_COL_NETWORK = 'Network'
RPT_COL_NETWORK_TARGETING = 'Network Targeting'
RPT_COL_PUBLISHER = 'Publisher'
RPT_COL_PUBLISHER_BID_CAP = 'Publisher Bid Cap'
RPT_COL_PUBLISHER_BID_PORTFOLIO_NAME = 'Publisher Bid Portfolio Name'
RPT_COL_PUBLISHER_BID_STRATEGY = 'Publisher Bid Strategy'
RPT_COL_PUBLISHER_BID_STRATEGY_LEVEL = 'Publisher Bid Strategy Level'
RPT_COL_PUBLISHER_NAME = 'Publisher Name'
RPT_COL_PUBLISHER_TARGET_IS = 'Publisher Target IS %'
RPT_COL_PUBLISHER_TARGET_IS_TYPE = 'Publisher Target IS Type'
RPT_COL_PUBLISHER_TARGET_ROAS = 'Publisher Target ROAS'
RPT_COL_PUBLISHER_TARGET_CPA = 'Publisher Target CPA'
RPT_COL_COUNTRY_DIM = 'Country_Dim'
RPT_COL_FRIENDLYNAME_DIM = 'FriendlyName_Dim'
RPT_COL_LINE_OF_BUSINESS_DIM = 'Line of Business_Dim'
RPT_COL_PROGRAM_DIM = 'Program_Dim'
RPT_COL_CAMPAIGNID = 'Campaign ID'
# output columns and initial values
BULK_COL_ACCOUNT = 'Account'
BULK_COL_CAMPAIGN = 'Campaign'
BULK_COL_PROGRAM_DIM = 'Program_Dim'
BULK_COL_CAMPAIGNID = 'Campaign ID'
# Program code to full name mapping
PROGRAM_MAPPING = {
'EMX': 'EmeraldEMX',
'CBUS': 'Client Briefings',
'SWPR': 'Solution Webinars',
'PR': 'PR',
'SI': 'Social Influencer',
'AR': 'AR',
'SAL': 'Sales',
'MVERT': 'Merchant Verticals',
'PTNR': 'Partner',
'OB': 'On-boarding',
'CCM': 'Client Communication Merchant',
'RTNK': 'Rethink',
'MENS': 'Merchant Enterprise New Sales',
'MECS': 'Merchant Enterprise Cross Sales',
'MEA': 'Merchant Enterprise All',
'MNS': 'Merchant SMB New Sales',
'PM': 'Paid Media'
}
# Function to extract value between second and third underscore and map to full name
def extract_and_map_program(campaign_name):
try:
parts = campaign_name.split('_')
if len(parts) >= 3:
program_code = parts[2]
# Return mapped full name if exists, otherwise return original code
return PROGRAM_MAPPING.get(program_code, program_code)
return None # Return None if format doesn't match
except:
return None # Return None for any errors
# Create output dataframe with required columns
outputDf = inputDf[[RPT_COL_ACCOUNT, RPT_COL_CAMPAIGN,RPT_COL_CAMPAIGNID]].copy()
outputDf.columns = [BULK_COL_ACCOUNT, BULK_COL_CAMPAIGN,BULK_COL_CAMPAIGNID]
# Apply the extraction and mapping function
outputDf[BULK_COL_PROGRAM_DIM] = outputDf[BULK_COL_CAMPAIGN].apply(extract_and_map_program)
# Remove rows where program extraction failed (where value is None)
outputDf = outputDf.dropna(subset=[BULK_COL_PROGRAM_DIM])
# Debug print first few rows
print(tableize(inputDf.head()))
print("\nOutput DataFrame:")
print(tableize(outputDf.head()))
Post generated on 2025-03-11 01:25:51 GMT